I, JUSWINDER SINGH, Ph.D. hereby declare that:

1. I am one of the named inventors of the
above-identified patent application.

2. I am a scientist at Biogen Idec MA Inc.
Currently, I am the Associate Director of Computational
Drug Design at Biogen Idec MA Inc. I have a Ph.D. degree
in Rational Drug Design from the University of London.
My CV is attached at Tab A.

3. I have reviewed the January 27, 2005 
Office Action in the above-identified application. I 
have also reviewed claims 39, 42 and 43 in their form as 
I understand them to be at this time, as well as at the 
time of that Office Action. 

4. I understand that the Examiner has 
rejected claims 39, 42 and 43 as being obvious in view of 
United States patent 6,703,486 ("Mandel"). I understand
that the Examiner relies on Mandel for teaching: 

— a computer for producing a 3-D 
representation of binding and active 
sites of a protein; 

— that such a computer may comprise a 
variety of programs for 3-D modeling 
based on atomic or X-ray coordinates; 

— that the data may be carried on 
computer-readable media; and 

-- that a monitor may be used for 
displaying results. 

I also understand that the Examiner believes: (1) that
the crystal structure coordinates of CD40L recited in the
claims do not functionally interact with the computer or
the programs stored in the computer; and (2) that the
computer or the programs are not structurally or
functionally changed by the coordinates of CD40L. As a
result, the Examiner considers the crystal structure
coordinates of CD40L to be nonfunctional descriptive
material, which do not distinguish the claimed computers
from the computers of Mandel.

5. I make this declaration to explain why the 
computer according to claims 39, 42 and 43, or a program 
contained in such a computer, interacts functionally with 
the novel crystal structure coordinates of human CD40L 
set forth in those claims and is functionally changed by 
them. As I explain below, a person of ordinary skill in 
the art as of the filing date of the pending application, 
and each of the applications to which this application 
claims priority, would appreciate that the coordinates of 
CD40L set forth in this application functionally interact 
with the computer, or the program contained in it, to
produce a model which is dependent on the coordinates 
contained therein. Consequently, the crystal structure 
coordinates of CD40L should not be considered 
nonfunctional descriptive material and, as a result, do 
distinguish the computers of claims 39, 42 and 43 from 
the computers of Mandel.

6. Claims 39, 42 and 43 refer to a computer
comprising a program for producing a 3-D representation 
of the specified novel binding site, molecule or 
molecular complex of human CD40L. The claims further
state that the computer comprises a computer-readable 
data storage medium comprising a data storage material 
encoded with computer-readable data, wherein the data 
comprises specific and novel structure coordinates of 
CD40L. 

7. The previously unknown structure 
coordinates of CD40L first described in the pending 
application were generated by using the technique of
X-ray crystallography. Scientists use X-ray
crystallography to solve and provide novel molecular
structures of macromolecules.[1] The molecular structures
solved by X-ray crystallography can be displayed in a
computer as a three-dimensional (3-D) representation.
The 3-D representations displayed by computers provide
scaffolds for in silico drug design useful in finding new
drugs for treating diseases. Only a 3-D representation
of the structure coordinates for a particular protein,
e.g., CD40L, can act as a scaffold for drug design based
on that protein. No other 3-D representation is capable
of substituting as a scaffold for drug design using that
specific protein as a target.

[1] See, e.g., Cantor, C.R. and Schimmel, P.R., Biophysical
Chemistry, Part II: Techniques for the Study of
Biological Structure and Function, W.H. Freeman and
Company, Chapter 13, pp. 687-791 (1980), attached at Tab
B.

8. Hence, a computer that produces a 3-D
representation of a particular protein, e.g., CD40L, is a
functionally unique entity as compared to a computer that
produces a 3-D representation of another protein. For
instance, only a computer and the structure coordinates
of a particular protein can be used to perform the in
silico manipulations on that particular protein. Thus,
using a computer of the present application, one of
ordinary skill in the art can mutate side-chains, perform
alterations of bond lengths or angles, interactively
calculate distances to identify interactions within and
between atoms as the model is changed, vary conformations
of amino acid residues, build ligands to fit into binding
pockets, dock molecules into binding sites, and perform
energy minimizations for the CD40L protein and complexes
containing that protein. It is important to understand
that, as illustrated in the present application, the
novel structure coordinates of CD40L actually
functionally integrate with the computer to provide a
dynamic 3-D representation of that particular protein.
That representation in turn has a function: it can be
used as a scaffold for drug design based on the
particular structure of CD40L. Structure coordinates
that are integrated with computers, as presently claimed,
are crucial to modern-day drug design.

9. Furthermore, in modern-day science, the 
structure of a particular target protein need not even be 
displayed for a scientist to conduct in silico drug 
design on that protein. However, the computer must 
contain the specific structure coordinates of a 
particular molecule or molecular complex and have the 
ability to display that structure. 

10. The pending application, for the first 
time, provided computers for producing novel 3-D 
representations of the specified binding sites, molecules 
or molecular complexes of CD40L, by converting the 
structure coordinates of CD40L into those 
representations.

11. The computers of the pending application
are functionally changed by the crystal structure
coordinates of CD40L as follows. The computers contain
computer program(s) for displaying a 3-D representation
of a specified molecular structure, i.e., CD40L. See,
for example, the description of how Molecular Graphics
manipulations are performed with QUANTA software run on a
Silicon Graphics Indigo2 computer (see, e.g., page 35,
lines 29-32 of the application). Computer programs for
visualization and molecular modeling, such as QUANTA (see
the application at page 35, lines 29-32; and page 26,
lines 14-21) and Sybyl (see the application at page 26,
lines 14-21) are described as well. A molecular model
for a novel protein is dependent on novel structure
coordinates. The computer program(s) interrelate and
integrate the computer and the novel structure
coordinates to produce novel 3-D representations of
molecular structures.[2] One of ordinary skill in the art
can then visually inspect and dynamically interact with
the 3-D representation of the particular molecular
structure and use that representation for drug design.
By means of such computer program(s), the computers of
the present application read the specified coordinates of
CD40L from memory and upon recognizing each coordinate,
cause a novel 3-D representation of the specific binding
site, molecule or molecular complex of CD40L to be
constructed in a proper axis system.

[2] See Lesk and Hardman, Methods in Enzymology, Vol. 115:
pp. 381-390 (1985), attached at Tab C.

12. Thus, a computer according to the present 
application displays the particular structure of CD40L 
dictated by the structure coordinates of that protein. 
Those structure coordinates enable the computer to 
display 3-D representations of particular CD40L molecular 
structures. Such a computer may then be used for drug 
design involving a particular CD40L molecular structure. 
In contrast, another computer that is identical, except 
that it does not contain the claimed novel set of CD40L 
structure coordinates, would not display the CD40L 
molecular structure and could not be used for drug design 
involving that structure. Therefore, a computer 
containing the structure coordinates for CD40L is endowed 
with function. 

13. A computer of the present application is 
very different from a computer/machine which produces
images and sounds from a computer-readable data storage 
medium containing a musical, artistic or literary work. 
The former produces a specific 3-D representation based 
on the structure coordinates of CD40L, which can 
interactively and dynamically be used for drug design
involving the CD40L structure defined by those
coordinates. As explained in detail above, such a 
computer can be used to manipulate the structure of 
CD40L, e.g., by mutating side-chains of the CD40L 
protein, changing conformation of amino acid residues of 
the CD40L protein, building ligands to fit into the 
binding pockets of the CD40L molecular structure, docking 
molecules into binding sites of the CD40L protein and 
performing energy minimizations for CD40L and its 
complexes. In contrast, a computer disk with music, a 
painting or a literary work stored on computer-readable 
data storage medium is designed only to be read by the 
computer and to produce sound or static two-dimensional 
visual or print images. In other words, unlike the 
computers of the present application, such a computer 
simply translates the data and does nothing more. Also, 
unlike the computers of the present application, such a 
computer does not produce an interactive, dynamic entity 
capable of functional use. 

14. A computer according to the present
application and the set of CD40L structure coordinates
combine as a whole to produce a functional relationship
and a functional entity. In this sense, the structure
coordinates of CD40L are integrated into the computer,
and in fact, do functionally interact with the computer,
and the computer is functionally changed by those
coordinates.

15. For all of these reasons, there is no 
question that a functional relationship exists between 
the computers of claims 39, 42 and 43 and the stored 
structure coordinates of CD40L specified therein. The 
structure coordinates of CD40L integrate with those 
computers to produce this functional relationship. The 
computers are novel and represent an advancement in the 
art beyond the computers of Mandel. This is so because 
Mandel's computers do not contain the structure
coordinates of CD40L recited in claims 39, 42 and 43 and, 
as a result, they are of no use for drug design targeting 
CD40L.

16. I declare further that all statements made
herein of my own knowledge are true and that all
statements made herein on information and belief are
believed to be true; and further, that these statements
were made with the knowledge that willful false
statements and the like so made are punishable by fine or
imprisonment, or both, under Section 1001, Title 18,
United States Code, and that such willful false
statements may jeopardize the validity of this
application and any patent issuing thereon.

Juswinder Singh, Ph.D.
13

X-ray crystallography 



13-1 X-RAY SCATTERING BY ATOMS AND MOLECULES 

X-ray diffraction is the most powerful technique currently available for studying the 
structure of large molecules. In many cases, x-ray diffraction studies on protein or 
nucleic acid crystals have yielded the complete tertiary structure at a level of resolution
of 3 Å or better. If only a less-well-ordered sample (such as an oriented fiber) is
available, x-ray diffraction still provides a wealth of structural information. Though 
insufficient to determine the structure uniquely, this information in many cases can 
provide decisive tests of structural models. Here we develop the theory of x-ray 
diffraction and describe some of the steps involved in obtaining structures from 
diffraction data. 



Outline and limitations of our treatment 

As one might expect, a technique that can provide so many structural details is 
intrinsically rather complex. We omit as many of the complications as possible and 
try to focus on the essential features of the method. Thus, atoms are treated as motionless,
even though in crystals there is appreciable motion at finite temperatures. Crystals
are treated as perfectly ordered arrays, even though they may actually be ordered only
in local domains. X-ray radiation is treated as monochromatic, even though a distribution
of wavelengths is always used in practice. Finally, diffraction data are considered
to be very precise, even though experimental errors often are a significant problem in
practice. 

To understand x-ray diffraction, one must know how x rays interact with atoms 
and the manner in which atoms can be organized into crystals. Most traditional 
descriptions of the technique start with a discussion of the symmetry and structure 
of crystals. Diffraction of x rays is described in terms of reflections from crystal planes. 
The structure of molecules within the crystal is introduced into the discussion only 
later. The reader probably has seen this approach before in more elementary texts. 
Here we use a different approach, elaborated by H. Lipson and C. A. Taylor (1958).
The x-ray scattering of single atoms is explained. Then we build in complexity to 
describe the x-ray scattering of sets of atoms (one-dimensional arrays) and, finally, 
of the three-dimensional arrays found in crystals. Although this treatment requires 
somewhat more sophisticated mathematics, there seems to be a consensus among 
practicing crystallographers that it ultimately affords much greater insight and 
understanding. 



X rays: short-wavelength electromagnetic radiation 

X rays are photons with wavelengths in the range of 0.1 Å to 100 Å. They usually are
generated by bombarding a target with electrons of energies of 10,000 electron volts
(eV) or more. Upon collision, these high-energy electrons can knock electrons out of
the target atoms, leaving vacancies in atomic shells. If, for example, a vacancy is
produced in the innermost (K) shell of an atom, it rapidly will be filled by an electron
descending from the next (L) shell, or one from the one after that (M). The photons
emitted as a result of these transitions are called, respectively, Kα and Kβ x rays.
Their wavelengths are

$$\lambda_{K\alpha} = hc/(E_L - E_K) \quad \text{and} \quad \lambda_{K\beta} = hc/(E_M - E_K) \tag{13-1}$$

where h is Planck's constant, c is the speed of light, and E refers to the energy of a
particular state (K, L, or M). Typical x rays used in structure determination are Cu
Kα (λ = 1.54 Å) and Mo Kα (λ = 0.71 Å).
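
As a quick numerical check of Equation 13-1, the following minimal sketch converts a transition energy into a wavelength using λ = hc/ΔE. The function name and the two photon energies (roughly 8.05 keV for Cu Kα and 17.48 keV for Mo Kα) are assumptions brought in for illustration; the text itself quotes only the resulting wavelengths.

```python
# Sketch: wavelength of a characteristic x-ray line from its transition energy,
# lambda = h*c / (E_upper - E_lower), as in Equation 13-1.
# The energies used below are approximate literature values, not from the text.

H_PLANCK = 6.62607015e-34      # Planck's constant, J s
C_LIGHT = 2.99792458e8         # speed of light, m/s
EV_TO_JOULE = 1.602176634e-19  # one electron volt in joules


def wavelength_angstrom(delta_e_ev):
    """Return the photon wavelength in angstroms for an energy gap in eV."""
    wavelength_m = H_PLANCK * C_LIGHT / (delta_e_ev * EV_TO_JOULE)
    return wavelength_m * 1e10


cu_k_alpha = wavelength_angstrom(8048.0)   # assumed Cu K-alpha energy, eV
mo_k_alpha = wavelength_angstrom(17479.0)  # assumed Mo K-alpha energy, eV
print(f"Cu K-alpha: {cu_k_alpha:.2f} A, Mo K-alpha: {mo_k_alpha:.2f} A")
```

With these assumed energies the computed wavelengths agree with the 1.54 Å and 0.71 Å values quoted in the text to two decimal places.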



Parameters that describe an electromagnetic wave 

X rays, like any other photons, are electromagnetic waves. A general expression for 
the propagation of one such wave in the k direction through space and time is 

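$$\begin{aligned}
E(\mathbf{r},t) &= E_0\, e^{2\pi i(\hat{\mathbf{k}}\cdot\mathbf{r}/\lambda - \nu t + \delta')} \\
&= E_0\{\cos[2\pi(\hat{\mathbf{k}}\cdot\mathbf{r}/\lambda - \nu t + \delta')] + i\sin[2\pi(\hat{\mathbf{k}}\cdot\mathbf{r}/\lambda - \nu t + \delta')]\}
\end{aligned} \tag{13-2}$$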



Figure 13-1

Characteristics of electromagnetic waves. (a) The electric field amplitude as a function of distance at
time zero. (b) The relative phase of two waves remains constant with time. The phase difference is δ
cycles (λδ cm) both at t = 0 and at t = T.



where $E(\mathbf{r}, t)$ is the electric field at point r and time t; $\hat{\mathbf{k}}$ is a unit vector in the k direction;
$\lambda$ is the wavelength in cm cycle$^{-1}$; $\nu$ is the frequency in cycles sec$^{-1}$ past a fixed
point; $\delta'$ is the phase of the wave (in cycles) that defines its amplitude at r = 0 and
t = 0; and $E_0$ is the maximal amplitude (Fig. 13-1). Such a transverse wave oscillates
periodically in both time and space.

It would be equally accurate to describe the wave by a real function, such as
$\sin[2\pi(\hat{\mathbf{k}}\cdot\mathbf{r}/\lambda - \nu t + \delta')]$, rather than by a complex function (Box 13-1). However,
the measured radiation intensity of a wave depends on the square of the amplitude,
and this always will be a real quantity. We choose to describe x rays by complex
exponentials because of the great mathematical convenience of working with such
functions. For example, $e^{a+b} = e^a e^b$, whereas $\sin(a + b) = \sin a \cos b + \cos a \sin b$.

Two waves propagating in the same direction with the same amplitude, wavelength,
and frequency can differ only in phase. We can describe them as

$$E_1(\mathbf{r},t) = E_0\, e^{2\pi i(\hat{\mathbf{k}}\cdot\mathbf{r}/\lambda - \nu t + \delta'_1)}$$

$$E_2(\mathbf{r},t) = E_0\, e^{2\pi i(\hat{\mathbf{k}}\cdot\mathbf{r}/\lambda - \nu t + \delta'_2)} = E_1\, e^{2\pi i\delta}$$

where $\delta = \delta'_2 - \delta'_1$ is the phase shift. Note that $\delta$ is constant for all space and time. If
two such phase-shifted waves are combined, the net amplitude is $E_1(\mathbf{r}, t)(1 + e^{2\pi i\delta})$.
When $\delta$ is zero, this net amplitude is just twice the individual amplitude but, when $\delta$
is one-half cycle, the net amplitude is zero because $e^{i\pi}$ is $-1$. Clearly, in situations
where an observable is a superposition of many waves, their relative phases are quite
critical.
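
The dependence of the combined amplitude on the phase shift can be checked numerically. The short sketch below evaluates the magnitude of the factor (1 + e^{2πiδ}) for a few phase shifts given in cycles; the function name is a hypothetical label chosen for this illustration.

```python
import cmath


def net_amplitude_factor(delta_cycles):
    """Return |1 + exp(2*pi*i*delta)| for a phase shift delta given in cycles."""
    return abs(1 + cmath.exp(2j * cmath.pi * delta_cycles))


# In phase (0 cycles), a quarter cycle apart, and half a cycle apart.
for delta in (0.0, 0.25, 0.5):
    print(delta, round(net_amplitude_factor(delta), 6))
```

The factor is 2 for waves in phase, about 1.414 for a quarter-cycle shift, and zero (to rounding error) for a half-cycle shift, which is the cancellation described above.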



Geometry of an x-ray scattering experiment 

Consider the geometry of the typical x-ray scattering experiment shown in Figure
13-2a. A collimated beam of x rays is allowed to impinge on a sample consisting of a
single electron located at the origin of the coordinate system. A unit vector, $\hat{\mathbf{s}}_0$,
describes the direction of the incoming radiation. Scattering will deflect a certain
fraction of the incident x rays, and will lead to radiation propagating away from the
sample in all directions. Suppose we could place an x-ray detector at some location
in space and measure the amplitude and phase of radiation scattered in that direction.
The position of the detector is denoted by another unit vector, $\hat{\mathbf{s}}$. The scattering angle
$\theta$ is defined as one-half the angle of deflection of $\hat{\mathbf{s}}$ relative to $\hat{\mathbf{s}}_0$ (Fig. 13-2a). We are
concerned here only with the elastic scattering of x rays. This means that the wavelength
of the incident and scattered radiation is the same.

Figure 13-2

X-ray scattering by a single electron. The angle of deflection ($2\theta$) between the source and the detector is
the same in all three cases. (a) An electron at the origin. (b) An electron located at position r relative
to the origin. (c) An expanded view near the origin, showing the path difference,
$\mathbf{r}\cdot\hat{\mathbf{s}} - \mathbf{r}\cdot\hat{\mathbf{s}}_0$, between radiation
scattered by an electron at r and that scattered by an electron at the origin. The numbers shown in
parts a and b measure the x-ray path in units of wavelength. The vectors $\hat{\mathbf{s}}_0$ and $\hat{\mathbf{s}}$ are unit vectors
describing the direction of incident rays and that of scattered rays seen by the detector, respectively.

Box 13-1 RELATIONSHIP BETWEEN SINES, COSINES, AND EXPONENTIALS

It is possible to express periodically varying functions either in terms of sines and cosines or
as complex exponentials. The basic relationship between these two representations is

$$e^{ix} = \cos x + i \sin x$$

One easy way to justify this relationship is to expand each of the functions in an infinite series:

$$e^{ix} = 1 + ix - x^2/2! - ix^3/3! + x^4/4! + ix^5/5! - \cdots$$

$$\cos x = 1 - x^2/2! + x^4/4! - x^6/6! + \cdots$$

$$i \sin x = ix - ix^3/3! + ix^5/5! - ix^7/7! + \cdots$$

Because $\cos(-x) = \cos x$, and $\sin(-x) = -\sin x$, it is obvious that

$$e^{-ix} = \cos x - i \sin x$$

Therefore, we can always represent trigonometric functions in terms of complex exponentials
as follows:

$$\cos x = (1/2)(e^{ix} + e^{-ix})$$

$$\sin x = (1/2i)(e^{ix} - e^{-ix})$$

The intensity of x-ray scattering will depend on the orientation of the sample 
relative to the incident and scattered rays. It is convenient mathematically and, as 
you will see shortly, very convenient conceptually to define a new single variable S, 
called the scattering vector: 

$$\mathbf{S} = (\hat{\mathbf{s}}/\lambda) - (\hat{\mathbf{s}}_0/\lambda) \tag{13-3}$$



Figure 13-3a shows the meaning of S. The direction of S bisects the angle between 
incident and scattered radiation. The dimensions of S are inverse length, so that S 
measures the number of cycles of radiation per cm. The length of S is a function of 
the total scattering angle (Fig. 13-3b). 



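$$\mathbf{S}\cdot\mathbf{S} = (\hat{\mathbf{s}}^2 + \hat{\mathbf{s}}_0^2 - 2\hat{\mathbf{s}}\cdot\hat{\mathbf{s}}_0)/\lambda^2 = 2(1 - \cos 2\theta)/\lambda^2 = (4\sin^2\theta)/\lambda^2 \tag{13-4}$$

Therefore the length of S is

$$|\mathbf{S}| = (2\sin\theta)/\lambda \tag{13-5}$$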
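
Taking the square root of Equation 13-4 gives |S| = 2 sin θ/λ. The sketch below checks this by building S directly from two unit vectors and comparing its length with the closed-form value. The 15° scattering angle and the helper names are assumptions chosen for illustration; the wavelength is the Cu Kα value quoted earlier in the chapter.

```python
import math


def scattering_vector(s0, s, wavelength):
    """S = (s - s0)/wavelength for unit vectors s0 (incident) and s (scattered)."""
    return tuple((a - b) / wavelength for a, b in zip(s, s0))


def norm(v):
    """Euclidean length of a 3-vector."""
    return math.sqrt(sum(c * c for c in v))


wavelength = 1.54            # angstroms
theta = math.radians(15.0)   # assumed scattering angle; total deflection is 2*theta
s0 = (1.0, 0.0, 0.0)         # incident beam along x
s = (math.cos(2 * theta), math.sin(2 * theta), 0.0)  # deflected by 2*theta

length_direct = norm(scattering_vector(s0, s, wavelength))
length_formula = 2 * math.sin(theta) / wavelength
print(length_direct, length_formula)
```

Both routes give the same length, in reciprocal angstroms, and setting θ to 90° reproduces the maximum value 2/λ.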



Figure 13-3

Basic geometry of an x-ray scattering experiment. The unit vectors $\hat{\mathbf{s}}_0$ and $\hat{\mathbf{s}}$ are defined in Figure 13-2.
(a) The scattering vector S is defined by Equation 13-3. (b) When the unit vector $\hat{\mathbf{s}}$ describing the
direction of scattered radiation is translated a distance $1/\lambda$ along the $\hat{\mathbf{s}}_0$ axis, it points directly toward
the tip of the scattering vector S. (c) Arrangements of $\hat{\mathbf{s}}_0$ and $\hat{\mathbf{s}}$ that lead to maximal and minimal values
of $|\mathbf{S}|$.
The value of $|\mathbf{S}|$ can vary from 0 to $2/\lambda$ (Fig. 13-3c). Thus, the vector S is described
in a finite coordinate system in which each axis has the dimensions of a reciprocal 
distance. This coordinate system is called reciprocal space. Like any other coordinate 
system, the space containing S can be expressed by many different possible axes. 
We derive later a particularly convenient representation that allows S to be related 
to the axes of a crystalline sample. 



Scattering as a function of electron position 

The radiation $E(\mathbf{S})$ seen by the detector (Fig. 13-2a) that results from the scattering of
a single electron at the origin can be computed by a proper consideration of the
quantum mechanics of photons interacting with matter.§ If we had more than one
electron located at the origin, the scattered radiation at any angle should simply 
increase in amplitude in direct proportion to the number of electrons. 

In crystallography, one is interested not so much in the scattering properties of 
individual electrons as in the effect of relative electron position on the pattern of 
scattering. Therefore, we can simply ask how the scattering changes as an electron 
is moved away from the origin. The structure factor, F(S), is defined as the ratio of the 
radiation scattered by any real sample to that scattered by a single electron at the 
origin. 

Suppose a sample contains a single electron located at position r, instead of at
the origin (Fig. 13-2b). The source and detector are very far away from the sample;
their distances from it are large compared to |r|. Therefore, to a very good approximation, the scattering
angle, $\theta = (1/2)\cos^{-1}(\hat{\mathbf{s}}\cdot\hat{\mathbf{s}}_0)$, is the same for this sample as it is for a sample with an
electron at the origin. The only difference in the two samples is the path length that
x rays must travel from source to sample to detector. This path difference is simply
$(\hat{\mathbf{s}} - \hat{\mathbf{s}}_0)\cdot\mathbf{r}$ (Fig. 13-2c). Such a path difference is equal to $(\hat{\mathbf{s}} - \hat{\mathbf{s}}_0)\cdot\mathbf{r}/\lambda = \mathbf{S}\cdot\mathbf{r}$ cycles, for
x rays of wavelength $\lambda$. Therefore, if the radiation scattered by an electron at the
origin is $E(\mathbf{S})$, moving the electron from the origin to a position r simply causes a phase
shift of $\mathbf{S}\cdot\mathbf{r}$ cycles. The scattered radiation is $E(\mathbf{S})e^{2\pi i\mathbf{S}\cdot\mathbf{r}}$, and the structure factor $F(\mathbf{S})$
is $e^{2\pi i\mathbf{S}\cdot\mathbf{r}}$.

In general, because electrons are not localized, it is better to describe an electron
density $\rho(\mathbf{r})$ in a volume element dr, located at r; the scattering then is proportional
to $\rho(\mathbf{r})\,d\mathbf{r}$. For continuous electron density at position r, the structure factor is

$$F(\mathbf{S}) = \rho(\mathbf{r})\, e^{2\pi i\mathbf{S}\cdot\mathbf{r}}\, d\mathbf{r} \tag{13-6}$$

where $\rho(\mathbf{r})\,d\mathbf{r}$ is the number of electrons in the volume element dr.

A sample with many discrete scattering sites has a structure factor that is simply
a sum over many terms corresponding to Equation 13-6. For a continuous electron
distribution, the sum is replaced by an integral:

$$F(\mathbf{S}) = \int d\mathbf{r}\, \rho(\mathbf{r})\, e^{2\pi i\mathbf{S}\cdot\mathbf{r}} \tag{13-7}$$

The integral is taken over the entire sample. Equation 13-7 is the single fundamental
equation that governs all x-ray scattering and diffraction. If the electron density
distribution $\rho(\mathbf{r})$ of a sample is known, one can compute the structure factor, and from
this one can compute the expected x-ray scattering for all scattering geometries.

§ The result shown in Figure 13-2a contains one serious oversimplification. In actuality, all scattered
radiation experiences a phase shift of one-half cycle relative to the phase of the exciting radiation. We can
ignore this.
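
For a sample made of discrete point electrons, the structure factor reduces to a sum of phase factors, one per electron, each of the form e^{2πiS·r}. The following sketch evaluates that sum; the positions and the scattering vector are hypothetical values picked so that the phases come out as simple fractions of a cycle.

```python
import cmath


def structure_factor(positions, S):
    """Sum exp(2*pi*i*S.r) over point electrons located at the given positions."""
    total = 0j
    for r in positions:
        phase_cycles = sum(Sc * rc for Sc, rc in zip(S, r))  # S.r, in cycles
        total += cmath.exp(2j * cmath.pi * phase_cycles)
    return total


S = (0.25, 0.0, 0.0)  # hypothetical scattering vector, reciprocal angstroms

one_at_origin = structure_factor([(0.0, 0.0, 0.0)], S)   # reference electron
one_displaced = structure_factor([(1.0, 0.0, 0.0)], S)   # S.r = 0.25 cycle
pair_half_cycle = structure_factor([(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)], S)  # 0.5 cycle apart
print(one_at_origin, one_displaced, abs(pair_half_cycle))
```

Moving the single electron changes only the phase of F(S), not its magnitude, while the two electrons separated by half a cycle scatter out of phase and cancel at this particular S.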



X-ray scattering in terms of Fourier transforms 

The mathematical form of Equation 13-7 is equivalent to a Fourier transform. This
is an integral with very convenient properties (Box 13-2). Note that, outside the
sample, $\rho(\mathbf{r})$ is zero. Therefore, the integral in Equation 13-7 can be extended over all
space without changing its value. Thus, the physical meaning of Equation 13-7 is that
the structure factor is a Fourier transform of the object.

Because $F(\mathbf{S})$ is the Fourier transform of $\rho(\mathbf{r})$, a second Fourier integral must
exist that relates these two quantities. This is the inverse Fourier transform:



$$\rho(\mathbf{r}) = (1/V) \int d\mathbf{S}\, e^{-2\pi i\mathbf{S}\cdot\mathbf{r}}\, F(\mathbf{S}) \tag{13-8}$$

The integral is taken over all reciprocal space. V is a constant that contains $(2\pi)^3$ and
other constants that compensate for the difference in the unit of volume of sample
space r and reciprocal space S. In what follows, we sometimes ignore the constant V.

Equation 13-8 means that, if one had measured or calculated values of F(S) 
extending over all reciprocal space, one could readily compute the electron density 
distribution of the object. Thus, Equations 13-7 and 13-8 form a relationship that 
lets one interconvert structure factors and electron densities freely, providing each 
is known over all space. It is similar in spirit to the relationship given by the Kronig- 
Kramers transforms in Chapter 8, which let CD and ORD data be interconverted. 
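
The interconversion can be illustrated with a discrete, one-dimensional analog. This is a sketch under stated assumptions, not the continuous three-dimensional integrals of the text: the density is sampled on N grid points, the integrals become sums, and the factor 1/N plays the role of the constant 1/V. The sample density values are made up.

```python
import cmath


def forward(density):
    """F_k = sum_j rho_j * exp(+2*pi*i*j*k/N): same sign as Equation 13-7."""
    n = len(density)
    return [sum(rho * cmath.exp(2j * cmath.pi * j * k / n)
                for j, rho in enumerate(density))
            for k in range(n)]


def inverse(factors):
    """rho_j = (1/N) * sum_k F_k * exp(-2*pi*i*j*k/N): same sign as Equation 13-8."""
    n = len(factors)
    return [sum(f * cmath.exp(-2j * cmath.pi * j * k / n)
                for k, f in enumerate(factors)) / n
            for j in range(n)]


density = [0.0, 1.0, 3.0, 1.0, 0.0, 0.0, 2.0, 0.0]  # made-up 1-D electron density
recovered = inverse(forward(density))
max_error = max(abs(a - b) for a, b in zip(recovered, density))
print(max_error)
```

The round trip recovers the starting density to rounding error, but only because every complex F_k, amplitude and phase together, is available to the inverse sum.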



An example of the properties of Fourier transforms 



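To illustrate the properties of Equations 13-7 and 13-8, we shall derive the latter
from the former. Set up the integral $I(\mathbf{r}') = \int F(\mathbf{S})\, d\mathbf{S}\, e^{-2\pi i\mathbf{S}\cdot\mathbf{r}'}$ in some new coordinate
system r'. Substitute for $F(\mathbf{S})$ from Equation 13-7:

$$I(\mathbf{r}') = \int d\mathbf{S}\, e^{-2\pi i\mathbf{S}\cdot\mathbf{r}'} \int d\mathbf{r}\, \rho(\mathbf{r})\, e^{2\pi i\mathbf{S}\cdot\mathbf{r}} \tag{13-9}$$

We can exchange the order of the two integrals to write

$$I(\mathbf{r}') = \int d\mathbf{r}\, \rho(\mathbf{r}) \int d\mathbf{S}\, e^{-2\pi i\mathbf{S}\cdot\mathbf{r}'} e^{2\pi i\mathbf{S}\cdot\mathbf{r}} = \int d\mathbf{r}\, \rho(\mathbf{r}) \int d\mathbf{S}\, e^{2\pi i\mathbf{S}\cdot(\mathbf{r}-\mathbf{r}')} \tag{13-10}$$

The integral over dS in the right-hand expression of Equation 13-10 has a very unusual
property. As shown in Box 13-3, it is the Dirac delta function:

$$\delta(\mathbf{r} - \mathbf{r}') = \int d\mathbf{S}\, e^{2\pi i\mathbf{S}\cdot(\mathbf{r}-\mathbf{r}')} \tag{13-11}$$

This function has the following characteristics. If $\mathbf{r} \neq \mathbf{r}'$, then $\delta(\mathbf{r} - \mathbf{r}') = 0$. If $\mathbf{r} = \mathbf{r}'$,
then $\delta(\mathbf{r} - \mathbf{r}') = \infty$. However, $\int d\mathbf{r}\, \delta(\mathbf{r} - \mathbf{r}') = 1$ and [for some arbitrary function
$g(\mathbf{r})$] $\int d\mathbf{r}\, g(\mathbf{r})\,\delta(\mathbf{r} - \mathbf{r}') = g(\mathbf{r}')$ if the integrals include the point $\mathbf{r} = \mathbf{r}'$. Thus, $\delta(\mathbf{r} - \mathbf{r}')$
will simply sample a function at $\mathbf{r} = \mathbf{r}'$. Equation 13-10 becomes $I(\mathbf{r}') = \rho(\mathbf{r}')$. Recognizing
that r and r' are equivalent variables, this result is identical to Equation 13-8
except for the constant V.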



Measuring the structure factor 

Unfortunately for x-ray scattering studies, no way is known to measure F(S) directly. 
F is a complex number that can be written as the product of two terms,

$$F = |F|\, e^{i\phi} \tag{13-12}$$

or as the sum of real and imaginary parts,

$$F = F_r + iF_i \tag{13-13}$$

The term $|F|$ is called the amplitude of the structure factor, and $e^{i\phi}$ is the phase.
Figure 13-4 shows the relationship between the two representations of $F(\mathbf{S})$.

$$F_r = |F|\cos\phi; \quad F_i = |F|\sin\phi \tag{13-14}$$

$$|F| = (F_r^2 + F_i^2)^{1/2}; \quad \phi = \tan^{-1}(F_i/F_r) \tag{13-15}$$
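
The conversion between the two representations of F can be sketched with Python's standard `cmath` module. The value of F below is hypothetical. Note that `cmath.polar` obtains the phase with a two-argument arctangent, which resolves the quadrant ambiguity of a bare tan⁻¹(F_i/F_r).

```python
import cmath
import math

F = complex(3.0, 4.0)  # hypothetical structure factor, F_r + i*F_i

# Real/imaginary parts to amplitude and phase (the second form of Equation 13-15).
amplitude, phase = cmath.polar(F)

# Amplitude and phase back to real and imaginary parts (Equation 13-14).
F_r = amplitude * math.cos(phase)
F_i = amplitude * math.sin(phase)

# Two structure factors with equal amplitude but different phase
# have the same squared amplitude.
F_other = cmath.rect(amplitude, phase + 1.0)
same_intensity = math.isclose(abs(F) ** 2, abs(F_other) ** 2)
print(amplitude, phase, F_r, F_i, same_intensity)
```

The last comparison shows why a quantity that depends only on |F|² carries no information about the phase.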
Representing a Function by a Fourier Series

Consider a completely arbitrary function $f(\theta)$, defined in the interval $\theta = -\pi$ to $\theta = \pi$. It
is possible to represent this function as an expansion in a series of functions with known
properties. Only certain sets of functions are suitable for such an expansion and, in the interval
$-\pi$ to $\pi$, sines and cosines together constitute such a set:

$$f(\theta) = \sum_{n}^{\infty} \left[ a_n \cos(n\theta) + a'_n \sin(n\theta) \right]$$

where the index n runs through all positive integers. This expansion is called a Fourier series.
The coefficients $a_n$ and $a'_n$ are numbers determined by the properties of $f(\theta)$.

As shown in Box 13-1, sines and cosines can be expressed in terms of complex exponentials.
Therefore, the Fourier series just given can instead be written as

$$f(\theta) = \sum_{n=-\infty}^{\infty} b_n e^{in\theta}$$

where the index n now runs through both positive and negative values because these are
necessary to describe sines and cosines. The coefficients $b_n$ can be found in a simple way by
making use of the following result.
For any two integers n and m,

$$\int_{-\pi}^{\pi} e^{in\theta} e^{-im\theta}\, d\theta = \int_{-\pi}^{\pi} e^{i(n-m)\theta}\, d\theta = [1/i(n-m)]\,(e^{i(n-m)\pi} - e^{-i(n-m)\pi})$$

$$= [2/(n-m)] \sin[(n-m)\pi] = 0 \quad \text{if } n \neq m$$

$$= 2\pi \quad \text{if } n = m$$

where the result for n = m can be proven by expanding the sine expression in a power series.
Therefore, to find a particular $b_m$, one performs the integral

$$(1/2\pi) \int_{-\pi}^{\pi} f(\theta)\, e^{-im\theta}\, d\theta = (1/2\pi) \int_{-\pi}^{\pi} d\theta \sum_{n=-\infty}^{\infty} b_n e^{in\theta} e^{-im\theta} = b_m$$

Note that the integral is carried out over the entire range of $\theta$ over which $f(\theta)$ is defined.
It often is convenient to be able to work with an arbitrary range $-L/2$ to $L/2$ rather than
with $-\pi$ to $\pi$. This is accomplished by defining a new variable, $x = L\theta/2\pi$, such that when
$\theta = \pi$, then $x = L/2$, and when $\theta = -\pi$, then $x = -L/2$. Incorporating this variable into the
above equations, and using the fact that $dx = (L/2\pi)\,d\theta$, we obtain

$$f(x) = \sum_{n=-\infty}^{\infty} b_n e^{2\pi inx/L}$$

$$b_n = (1/L) \int_{-L/2}^{L/2} e^{-2\pi inx/L} f(x)\, dx$$
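
The coefficient formula can be tried numerically. In this sketch the integral for b_n is approximated by a midpoint sum, and the test function f(x) = 1 + 2 cos(2πx/L) is an assumption chosen because its expansion is known in advance: it equals 1 + e^{2πix/L} + e^{-2πix/L}, so b_0 = b_1 = b_{-1} = 1 and every other coefficient vanishes.

```python
import cmath
import math


def fourier_coefficient(f, n, L, samples=2000):
    """Approximate b_n = (1/L) * integral over [-L/2, L/2] of
    exp(-2*pi*i*n*x/L) * f(x) dx using a midpoint Riemann sum."""
    dx = L / samples
    total = 0j
    for k in range(samples):
        x = -L / 2 + (k + 0.5) * dx  # midpoint of the k-th subinterval
        total += cmath.exp(-2j * cmath.pi * n * x / L) * f(x) * dx
    return total / L


L = 4.0  # arbitrary interval length


def f(x):
    """Test function with known coefficients b_0 = b_1 = b_-1 = 1."""
    return 1.0 + 2.0 * math.cos(2 * math.pi * x / L)


b0 = fourier_coefficient(f, 0, L)
b1 = fourier_coefficient(f, 1, L)
b_minus1 = fourier_coefficient(f, -1, L)
b2 = fourier_coefficient(f, 2, L)
print(b0, b1, b_minus1, b2)
```

The computed values match the expected coefficients to rounding error, including the vanishing b_2 that follows from the orthogonality result above.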
Fourier Transforms in One Dimension

The function f(x) is defined at all x, whereas the set of coefficients $b_n$ represents an infinite
array of numbers, which must be tabulated. Therefore, it is convenient to find an analog of
the Fourier series in which the coefficients $b_n$ are replaced by a function, and the summation
is replaced by an integral. This representation is called a Fourier transform when the interval
over which the function is defined extends from $-\infty$ to $+\infty$.

We define a new continuous variable, $S = 2\pi n/L$, and a new continuous function $g(S) = Lb_n$.
Using these, the equation for $b_n$ is transformed to

$$g(S) = \int_{-\infty}^{\infty} e^{-2\pi iSx} f(x)\, dx \tag{A}$$

in the limit as $L \to \infty$. The series expansion for f(x) becomes

$$f(x) = \sum_{n=-\infty}^{\infty} [g(S)/L]\, e^{2\pi iSx}$$

To replace the sum by an integral, note that the interval $\Delta S$ corresponds to $(2\pi/L)\Delta n$ from
the definition of S. But $\Delta n = 1$ in the summation, and therefore each increment dS in an
integral is equivalent to $2\pi/L$ in the sum. Thus,

$$f(x) = (L/2\pi) \int_{-\infty}^{\infty} [g(S)/L]\, e^{2\pi iSx}\, dS = (1/2\pi) \int_{-\infty}^{\infty} g(S)\, e^{2\pi iSx}\, dS \tag{B}$$

Equations A and B constitute a pair of Fourier transforms that allow f(x) to be calculated
if g(S) is known, and vice versa. They are particularly interesting because the variables
x and S have opposite dimensions. For example, if x is distance, then S is reciprocal distance.
The factor of $(1/2\pi)$ in equation B often is written instead as $(1/\sqrt{2\pi})$ in front of the integrals
in both equations A and B.

Fourier Transforms in Three Dimensions 

Suppose the function f is now defined in a Cartesian coordinate system with axes x, y, z. For
fixed y and z, the function f(x, y, z) can be expanded in a Fourier series in e^{2πi S_x x}, and the
Fourier transform becomes (by analogy to Equation A)

    k(S_x) = \int_{-\infty}^{\infty} e^{-2\pi i S_x x} f(x, y, z)\,dx

This expression, in turn, can be expanded in the function e^{2πi S_y y} for fixed z, and finally as a
function of e^{2πi S_z z}. The resulting three-dimensional Fourier transform is

    g(S_x, S_y, S_z) = \int_{-\infty}^{\infty} dz\, e^{-2\pi i S_z z} \int_{-\infty}^{\infty} dy\, e^{-2\pi i S_y y} \int_{-\infty}^{\infty} dx\, e^{-2\pi i S_x x} f(x, y, z)

If we use the vector S to represent the three variables S_x, S_y, and S_z, and we use r to represent
x, y, and z, then the three-dimensional transform can be written very compactly as

    g(\mathbf{S}) = \int_{-\infty}^{\infty} d\mathbf{r}\, e^{-2\pi i \mathbf{S} \cdot \mathbf{r}} f(\mathbf{r})

Similarly, the analog of Equation B becomes

    f(\mathbf{r}) = (1/2\pi)^3 \int_{-\infty}^{\infty} d\mathbf{S}\, e^{2\pi i \mathbf{S} \cdot \mathbf{r}} g(\mathbf{S})



Figure 13-4

The structure factor plotted as a vector in the complex plane. [Axes: real axis and imaginary axis.]



Box 13-3 THE DIRAC DELTA FUNCTION 

We wish to demonstrate that the following integral is a representation of the one-dimensional
Dirac delta function:

    \delta(x - x') = \int_{-\infty}^{\infty} e^{2\pi i (x - x') S}\,dS

The results can easily be generalized to three dimensions. If this is the delta function, it must
obey three properties.

First, if x' = x, then δ(x − x') = ∞. It is obvious that, with x = x', the exponential in the
above integral is just unity; therefore, the integral is infinite.

Second, if x' ≠ x, then δ(x − x') = 0. It is not so obvious that the integral meets this
requirement. The way to realize that it does is to note that the complex exponential is a
periodic function that continually oscillates from −1 to 1 throughout all space. For each
positive lobe there exists an adjacent (absolutely equivalent) negative lobe. The areas
underneath these lobes cancel identically.

Third, if x' lies between a and b, then

    \int_a^b dx\, \delta(x - x') = 1

Let a = x' − ε, and b = x' + ε. Then the area under the delta function is
Experimentally, all one can observe is the intensity of radiation scattered at an
angle 2θ. If we express this intensity relative to the intensity scattered by a single
electron at the origin, it is

    I(S) = F(S) F^*(S) = |F|^2    (13-16)

We must multiply by the complex conjugate, rather than simply by F(S), because F is
a complex number. The intensity is a pure observable and must be real. It is given
by the square of the amplitude of the structure factor. Thus |F| can be measured
experimentally. The phase term (e^{iφ}) of F(S) is not directly measurable; this is the
major obstacle in x-ray scattering and diffraction studies. In order to use Equation
13-8 to calculate ρ(r), one first must guess, calculate, or indirectly estimate e^{iφ}.
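The loss of phase information in Equation 13-16 can be made concrete with a small sketch. The two structure-factor values below are hypothetical numbers chosen only to share an amplitude while differing in phase; the helper name is likewise an assumption.

```python
import cmath

def intensity(F):
    """I = F * conj(F) = |F|^2, the quantity that is actually measured."""
    return (F * F.conjugate()).real

# Two hypothetical structure factors: same amplitude, different phase.
F1 = cmath.rect(3.0, 0.4)  # amplitude 3, phase 0.4 rad
F2 = cmath.rect(3.0, 2.1)  # amplitude 3, phase 2.1 rad

I1 = intensity(F1)
I2 = intensity(F2)
```

Both values give the same intensity, so a measurement of I(S) alone cannot distinguish them.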



    = \int_{-\infty}^{\infty} e^{-2\pi i x' S} [(e^{2\pi i x' S} / 2\pi i S)\, 2i \sin 2\pi\varepsilon S]\,dS
    = (1/\pi) \int_{-\infty}^{\infty} [(\sin 2\pi\varepsilon S)/S]\,dS = 1

because

    \int_0^{\infty} [(\sin x)/x]\,dx = \int_{-\infty}^{0} [(\sin x)/x]\,dx = \pi/2

If x' is not between a and b, then the integral ∫dx δ(x − x') is zero, because the function is
everywhere zero. Thus we see that the integral originally given meets all the requirements,
and is in fact the Dirac delta function.

A most important property of the delta function is the ability to shift the location of
another function:

    \int_{-\infty}^{\infty} dx\, f(x)\, \delta(x - x') = f(x')

We can demonstrate this by choosing a narrow interval x' − ε to x' + ε near x' and breaking
up the integral into three parts:

    \int_{-\infty}^{x'-\varepsilon} dx\, f(x)\, \delta(x - x') + \int_{x'-\varepsilon}^{x'+\varepsilon} dx\, f(x)\, \delta(x - x') + \int_{x'+\varepsilon}^{\infty} dx\, f(x)\, \delta(x - x')

The first and third integrals are zero for any finite-valued function f(x), because everywhere
within them δ(x − x') = 0. The second integral can be evaluated if we choose ε small enough
so that f(x) = f(x'); then it becomes

    f(x') \int_{x'-\varepsilon}^{x'+\varepsilon} dx\, \delta(x - x') = f(x')
The electron density in Equations 13-7 and 13-8 is, in principle, measurable
directly, and therefore it must be real. As probed by x-ray scattering experiments, ρ(r)
behaves as a real quantity so long as there is no anomalous scattering (vide infra).
The reality of ρ(r) allows a constraint on F(S) to be developed. Because ρ(r) is
real, ρ(r) = ρ*(r). Substituting Equation 13-13 into Equation 13-8, and taking the
complex conjugate, we obtain

    \int [F_r(\mathbf{S}) + iF_i(\mathbf{S})]\, e^{-2\pi i \mathbf{S} \cdot \mathbf{r}}\,d\mathbf{S} = \int [F_r(\mathbf{S}) - iF_i(\mathbf{S})]\, e^{+2\pi i \mathbf{S} \cdot \mathbf{r}}\,d\mathbf{S}    (13-17a)

Note that r can take on any value. For Equation 13-17a to hold for any arbitrary
value of r, it is necessary (for every value of S) that

    F_r(\mathbf{S}) = F_r(-\mathbf{S}) and F_i(\mathbf{S}) = -F_i(-\mathbf{S})    (13-17b)

In other words, the real part of the scattering function must be symmetrical about 
the origin of S space, whereas the imaginary part is antisymmetric. When such a 
relationship holds, the function F(S) is called a conjugate function. 

When the results of Equation 13-17b are substituted into the definition of the
intensity, an interesting result emerges:

    I(\mathbf{S}) = |F(\mathbf{S})|^2 = F_r^2 + F_i^2 = |F(-\mathbf{S})|^2 = I(-\mathbf{S})    (13-18)

Equation 13-18 reveals that the observed pattern of scattered intensity is symmetric 
about the origin of reciprocal space at S = 0. The result, that I(S) has a center of 
symmetry, is called Friedel's law. It means that one has to measure only half the 
scattering to obtain all the information it contains. 
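Friedel's law can be checked numerically with a short sketch. It assumes a one-dimensional sample made of point scatterers with real weights, which is a simplification of the continuous ρ(r) in the text; the weights, positions, and function name are hypothetical.

```python
import cmath
import math

def structure_factor_1d(atoms, S):
    """F(S) = sum_j w_j * exp(2*pi*i*S*x_j) for real weights w_j at positions x_j."""
    return sum(w * cmath.exp(2j * math.pi * S * x) for w, x in atoms)

# Hypothetical real "density": (weight, position) pairs with no center of symmetry.
atoms = [(6.0, 0.10), (8.0, 0.37), (7.0, 0.80)]

S = 1.7
F_plus = structure_factor_1d(atoms, S)
F_minus = structure_factor_1d(atoms, -S)

I_plus = abs(F_plus) ** 2
I_minus = abs(F_minus) ** 2
```

Here F(−S) is the complex conjugate of F(S), so the real parts agree, the imaginary parts differ in sign, and the two intensities are equal.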

A requirement for heterogeneities in electron density 

Suppose that an experimental sample consists of a uniform distribution of electron
density, ρ(r) = ρ. Then the expected structure factor is

    F(\mathbf{S}) = \rho \int d\mathbf{r}\, e^{2\pi i \mathbf{S} \cdot \mathbf{r}}    (13-19)

But this is just the Dirac delta function δ(S − 0). The only x rays that emerge from
the sample are F(0). From Figure 13-3, S = 0 corresponds to scattered radiation
parallel to the incident beam. In other words, a uniform sample cannot deflect x rays
at all, just as a medium of constant refractive index cannot bend or focus collimated
light.

A fundamental principle of scattering is the requirement for spatial (or temporal) 
heterogeneities. Scattering is caused by the contrast between a given region and its 
neighbors. We now must calculate the scattering that results from the presence of 
discrete atoms, and then that resulting from arrangements of atoms found in molecules
or crystal lattices. 



Scattering from a single atom at the origin 

Suppose the sample consists solely of a single atom located at the origin. The detailed
pattern of electron density around an individual atom depends on the bonding it is
involved in. However, in almost all x-ray diffraction experiments, the resolution is not
high enough to detect this detailed pattern. Thus, it is a good approximation to model
the electron distribution of an atom as spherically symmetric. Then ρ(r), a function of the
vector r, becomes ρ(r), a function only of the length r.
If we express Equation 13-7 in spherical polar coordinates (Fig. 13-5a),

    F(\mathbf{S}) = \int_0^{2\pi} d\phi \int_0^{\pi} \sin\theta\,d\theta \int_0^{\infty} dr\, \rho(r)\, r^2 e^{2\pi i \mathbf{S} \cdot \mathbf{r}}
                  = 2\pi \int_0^{\infty} dr\, \rho(r)\, r^2 \int_0^{\pi} d\theta\, \sin\theta\, e^{2\pi i S r \cos\theta}    (13-20)

where S and r are the lengths of the vectors S and r, respectively. By making the
substitution x = cos θ, we can easily evaluate the θ integral, obtaining

    F(S) = 4\pi \int_0^{\infty} dr\, \rho(r)\, r^2 [(\sin 2\pi S r)/2\pi S r] = f(S)    (13-21)

Figure 13-5

X-ray scattering from atoms. (a) Coordinate system used to evaluate Equation 13-20. (b) Atomic
scattering factor for various atoms as a function of the scattering angle (2θ). [After J. P. Glusker and
K. N. Trueblood, Crystal Structure Analysis: A Primer (London: Oxford Univ. Press, 1972).]
(c) Coordinate system used to describe an atom not at the origin.
The function f(S) is defined as the atomic scattering factor. It depends only on
|S| and thus, from Equation 13-5, it depends only on the angle between s_0 and s, and
not on the orientation of the sample. Note that, because ρ(r) = ρ(−r), the function
f(S) is real. Therefore, the intensity measured in a scattering experiment on a single
atom, I(S), can be used to compute f(S) directly by Equation 13-16:

    f(S) = \pm [I(S)]^{1/2}    (13-22)

The only ambiguity is the choice of sign; we can arbitrarily define this sign to be
positive.

For real atoms, ρ(r) can be crudely approximated by a Gaussian distribution of
electron density: ρ(r) = zN e^{−kr²}, where z is the number of electrons, N is a normalization
constant, and k is related to the width of the Gaussian. Then Equation 13-21 can
be integrated to yield

    f(S) = z\, e^{-\pi^2 S^2 / k}    (13-23)


This relationship shows that the atomic scattering factor has the same sign everywhere
in space. The atomic scattering factor for forward-scattered radiation (S = 0) is
simply the number of electrons. Equation 13-23 shows that spherical atoms scatter
x rays most efficiently in the forward directions. The scattering factor drops fairly
rapidly with increasing scattering angle (Fig. 13-5b).
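The behavior described for a Gaussian atom can be checked by integrating Equation 13-21 numerically. The sketch below assumes the normalization N = (k/π)^{3/2}, so that ρ(r) integrates to z electrons, and compares the result with the closed form z·exp(−π²S²/k) that follows from the Gaussian integral under that assumption. The values z = 6 and k = 2, the integration cutoff, and the function names are hypothetical.

```python
import math

def gaussian_density(r, z, k):
    """rho(r) = z * N * exp(-k r^2), with N = (k/pi)^(3/2) so that rho integrates to z."""
    return z * (k / math.pi) ** 1.5 * math.exp(-k * r * r)

def atomic_scattering_factor(S, z, k, r_max=10.0, steps=20000):
    """Midpoint-rule estimate of Eq. 13-21:
    4*pi * integral of rho(r) r^2 sin(2 pi S r) / (2 pi S r) dr from 0 to r_max."""
    dr = r_max / steps
    total = 0.0
    for j in range(steps):
        r = (j + 0.5) * dr
        arg = 2 * math.pi * S * r
        sinc = math.sin(arg) / arg if arg != 0.0 else 1.0
        total += gaussian_density(r, z, k) * r * r * sinc * dr
    return 4 * math.pi * total

def gaussian_closed_form(S, z, k):
    """Closed form of the same integral for the Gaussian density: z * exp(-pi^2 S^2 / k)."""
    return z * math.exp(-math.pi ** 2 * S * S / k)

z, k = 6, 2.0  # hypothetical atom: 6 electrons, width parameter k
f_forward = atomic_scattering_factor(0.0, z, k)
f_mid = atomic_scattering_factor(0.3, z, k)
f_far = atomic_scattering_factor(0.6, z, k)
```

At S = 0 the numerical value equals the number of electrons, and it falls monotonically without changing sign as S increases.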



Scattering from atoms not located at the origin 

Next, suppose the sample is still a single spherical atom, but it now is centered at the
position r_n. As before, r is a vector from the coordinate system origin to a point within
the electron density distribution of the atom. R is a vector from the atom center to the
point r. It is defined by r = R + r_n (Fig. 13-5c). Thus, from Equation 13-7, the x-ray
scattering is

    F(\mathbf{S}) = \int d(\mathbf{R} + \mathbf{r}_n)\, \rho(\mathbf{R} + \mathbf{r}_n)\, e^{2\pi i \mathbf{S} \cdot (\mathbf{R} + \mathbf{r}_n)}    (13-24)

Because r_n is constant, d(R + r_n) = dR, and the term e^{2πiS·r_n} can be removed from
the integral:

    F(\mathbf{S}) = e^{2\pi i \mathbf{S} \cdot \mathbf{r}_n} \int d\mathbf{R}\, \rho(\mathbf{R} + \mathbf{r}_n)\, e^{2\pi i \mathbf{S} \cdot \mathbf{R}}    (13-25)

The integral in Equation 13-25 is taken over all space. Because ρ is the electron
density distribution around the atom, the constant vector r_n specifying the original
coordinate system is irrelevant. Thus, this integral is identical to Equation 13-20.
It is just the atomic scattering factor, and so Equation 13-25 becomes

    F(\mathbf{S}) = f(S)\, e^{2\pi i \mathbf{S} \cdot \mathbf{r}_n}    (13-26)


For a set of N atoms, each located at position r_n with atomic scattering factor
f_n, the total structure factor expected is

    F(\mathbf{S}) = \sum_{n=1}^{N} f_n(S)\, e^{2\pi i \mathbf{S} \cdot \mathbf{r}_n}    (13-27)

where f_n is the scattering factor of the nth atom. If the N atoms happen to belong to a
single molecule, then Equation 13-27 is called the molecular structure factor, F_m(S).
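Equation 13-27 translates almost directly into code. The following is a minimal sketch, assuming each atomic scattering factor can be treated as a constant (in the text it depends on |S|); the three-atom molecule, its coordinates, and the function names are hypothetical.

```python
import cmath
import math

def dot(u, v):
    """Dot product of two 3-vectors given as tuples."""
    return sum(a * b for a, b in zip(u, v))

def structure_factor(atoms, S):
    """Eq. 13-27: F(S) = sum_n f_n * exp(2*pi*i * S.r_n).

    atoms is a list of (f_n, r_n) pairs; f_n is held constant here,
    which ignores its dependence on |S|.
    """
    return sum(f * cmath.exp(2j * math.pi * dot(S, r)) for f, r in atoms)

# Hypothetical three-atom molecule: (scattering factor, position) pairs.
molecule = [
    (6.0, (0.0, 0.0, 0.0)),
    (8.0, (1.2, 0.0, 0.0)),
    (7.0, (0.0, 1.4, 0.3)),
]

F0 = structure_factor(molecule, (0.0, 0.0, 0.0))   # forward scattering
F = structure_factor(molecule, (0.25, 0.10, 0.0))  # a general scattering vector
amplitude, phase = abs(F), cmath.phase(F)
```

At S = 0 every phase factor is unity and F is just the sum of the f_n; elsewhere the phase factors partially cancel, so the amplitude cannot exceed that sum.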

Consider the case of a sample with a center of symmetry. If that center is placed
at the origin, then for each atom at r_n contributing f_n(S)e^{2πiS·r_n} in Equation 13-27,
there must be an equivalent atom at −r_n contributing f_n(S)e^{−2πiS·r_n}. Because e^{±ix} =
cos x ± i sin x (Box 13-1), the structure factor of the sample can be written as the
centrosymmetric function:

    F_m(\mathbf{S}) = \sum_{n=1}^{N/2} 2 f_n(S) \cos(2\pi \mathbf{S} \cdot \mathbf{r}_n)    (13-28)

which is a sum over N/2 symmetry-related pairs of atoms. This is a real function, and
therefore the problem of determining the phase of F_m(S) is dramatically simplified.
From Figure 13-4, note that φ must be either 0 or π. Thus the term e^{iφ} must be simply
+1 or −1 at each point S, corresponding respectively to φ = 0 or φ = π.
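The simplification for a centrosymmetric sample can be verified numerically. This sketch builds a hypothetical four-atom arrangement from two atoms and their inversion partners, evaluates the general sum of Equation 13-27, and compares it with the cosine sum of Equation 13-28. Constant scattering factors and all numerical values are assumptions.

```python
import cmath
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def structure_factor(atoms, S):
    """General sum, Eq. 13-27, with constant f_n."""
    return sum(f * cmath.exp(2j * math.pi * dot(S, r)) for f, r in atoms)

def centrosymmetric_structure_factor(half, S):
    """Eq. 13-28: sum over the N/2 unique atoms of 2 f_n cos(2 pi S.r_n)."""
    return sum(2 * f * math.cos(2 * math.pi * dot(S, r)) for f, r in half)

# Hypothetical unique half of the structure; the other half is generated by inversion.
half = [(6.0, (0.4, 0.1, 0.0)), (8.0, (1.1, -0.7, 0.5))]
full = half + [(f, tuple(-c for c in r)) for f, r in half]

S = (0.3, 0.2, -0.1)
F_full = structure_factor(full, S)
F_real = centrosymmetric_structure_factor(half, S)

# The phase factor e^{i phi} reduces to a sign: +1 (phi = 0) or -1 (phi = pi).
sign = 1 if F_real >= 0 else -1
```

The imaginary part of the general sum vanishes to rounding error, and its real part matches the cosine sum.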
Interference fringes from sets of atoms

In general, moving an atom to position r_n (away from the origin) introduces a phase
shift, e^{2πiS·r_n}, in the x-ray scattering. Note that, for a single atom, this will lead to no
observable change in scattering because the intensity is still I(S) = f²(S). Suppose,
however, a sample contains one atom at the origin and an identical atom at position
r_n. The total structure factor will be

    F(\mathbf{S}) = f(S)(1 + e^{2\pi i \mathbf{S} \cdot \mathbf{r}_n})    (13-29a)

The scattering intensity will be

    I(\mathbf{S}) = f^2(S)(1 + e^{2\pi i \mathbf{S} \cdot \mathbf{r}_n})(1 + e^{-2\pi i \mathbf{S} \cdot \mathbf{r}_n}) = 2 f^2(S)[1 + \cos(2\pi \mathbf{S} \cdot \mathbf{r}_n)]    (13-29b)

Thus, in addition to the scattering seen from each of the atoms separately, f²(S),
there is an interference pattern generated by the cos(2πS·r_n) term (see Fig. 13-6b). This
is exactly comparable to the interference fringes seen in a two-slit experiment in
optical diffraction (see Box 13-4). The term e^{2πiS·r_n} in Equation 13-29 often is called
a fringe function.
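The step from Equation 13-29a to 13-29b can be confirmed numerically. In the sketch below the scalar argument stands for the dot product S·r_n, and f is treated as a constant; both function names are hypothetical.

```python
import cmath
import math

def two_atom_intensity_direct(f, S_dot_r):
    """|F|^2 with F = f * (1 + exp(2*pi*i * S.r_n)), as in Eq. 13-29a."""
    F = f * (1 + cmath.exp(2j * math.pi * S_dot_r))
    return abs(F) ** 2

def two_atom_intensity_formula(f, S_dot_r):
    """Eq. 13-29b: 2 f^2 [1 + cos(2*pi * S.r_n)]."""
    return 2 * f * f * (1 + math.cos(2 * math.pi * S_dot_r))

# Sample values of S.r_n across one fringe period.
samples = [0.0, 0.1, 0.25, 0.5, 0.9]
pairs = [(two_atom_intensity_direct(1.5, x), two_atom_intensity_formula(1.5, x))
         for x in samples]
```

The intensity is 4f² where S·r_n is an integer (constructive interference) and zero where it is a half-integer (destructive interference).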

If it were possible to measure the scattering from a sample containing just a few
atoms, the pattern of fringes would yield information on the spatial arrangement.
Such measurements are impossible because the intensity of radiation scattered from
just a few atoms is too small. The number of terms in the structure factor increases
as the total number of atoms (N_T) increases, and therefore the observed intensities
increase as N_T². To increase the number of atoms without loss of information, it is
necessary to work with periodic arrays of atoms, such as atoms in crystals. Here we
shall demonstrate the pattern of fringes introduced by such arrays.



Calculation of x-ray diffraction from a one-dimensional array 

Start with a one-dimensional row of 2N + 1 identical atoms. Locate the central atom
at the origin of the coordinate system. As shown in Figure 13-6a, the position of each
atom in the array is generated from that of its neighbor by translation along a vector
a. The position of the nth atom in the array is na. The structure factor resulting from
this atom is given by Equation 13-26:

    F_n(\mathbf{S}) = e^{2\pi i n \mathbf{S} \cdot \mathbf{a}} f(S)    (13-30)

Thus the scattering from any of the atoms can be written in terms of the atomic
scattering factor f(S) for an atom at the origin. The structure factor for the whole
array can be written as

    F_{Tot}(\mathbf{S}) = f(S) \sum_{n=-N}^{N} e^{2\pi i n \mathbf{S} \cdot \mathbf{a}}    (13-31)

The sum is the fringe function resulting from the array.

We could treat a linear array of molecules in a similar way. If this array is
generated by translation, the resulting scattering will be given by an equation identical
to Equation 13-31, except that the molecular structure factor F_m(S) will replace
the atomic scattering factor f(S). However, with molecules, more complex arrays



[Figure 13-6: (a) a row of atoms indexed n = −5 to 5; (b) intensity plotted against S·a, in panels running from "1 atom at origin" to "An infinite row of atoms".]

Figure 13-6

X-ray scattering from a one-dimensional array of atoms. (a) The array, as defined by the vector
translation a. (b) X-ray scattering intensity as a function of the number of atoms in the array. Shown
are the actual observed scattering (black line), the scattering expected for a single atom (dashed line),
and the fringe function produced by the array (colored line). The observed scattering is the product of
the fringe function and the single-atom scattering. Note the changes in vertical scale as the number of
atoms increases. The horizontal scale is in units of S·a and is the same for all cases.
can be generated if there are rotations as well as translations relating adjacent
molecules. These cases can be handled by simple extension of the methods used here;
some examples are given in Chapter 14, where we discuss scattering from helices.

Equation 13-31 is simply a geometric series with an initial term e^{−2πiNS·a}, a
constant ratio e^{2πiS·a}, and a final term e^{2πiNS·a}. The sum of a geometric series is
t(1 − r^m)/(1 − r), where r is the ratio between terms, m is the number of terms, and
t is the first term. Using this expression, Equation 13-31 becomes

    F_{Tot}(\mathbf{S}) = f(S)\, e^{-2\pi i N \mathbf{S} \cdot \mathbf{a}} [1 - e^{2\pi i (2N+1) \mathbf{S} \cdot \mathbf{a}}] / [1 - e^{2\pi i \mathbf{S} \cdot \mathbf{a}}]    (13-32)

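Equation 13-32 can be simplified by multiplying both numerator and denominator
by e^{−πiS·a}. Then

    F_{Tot}(\mathbf{S}) = f(S) [e^{-\pi i (2N+1) \mathbf{S} \cdot \mathbf{a}} - e^{\pi i (2N+1) \mathbf{S} \cdot \mathbf{a}}] / [e^{-\pi i \mathbf{S} \cdot \mathbf{a}} - e^{\pi i \mathbf{S} \cdot \mathbf{a}}]
                        = f(S) \sin[(2N+1)\pi \mathbf{S} \cdot \mathbf{a}] / \sin(\pi \mathbf{S} \cdot \mathbf{a})    (13-33)

We have used the fact that e^{±ix} = cos x ± i sin x to reach the final form of Equation
13-33.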
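The equivalence of the direct sum in Equation 13-31 and the closed form in Equation 13-33 can be checked with a few lines of code. In this sketch the scalar x stands for S·a, the factor f(S) is omitted because it is common to both sides, and the function names are hypothetical.

```python
import cmath
import math

def fringe_sum(N, x):
    """Direct sum of exp(2*pi*i*n*x) for n = -N ... N (the sum in Eq. 13-31)."""
    return sum(cmath.exp(2j * math.pi * n * x) for n in range(-N, N + 1))

def fringe_closed_form(N, x):
    """sin[(2N+1) pi x] / sin(pi x), the quotient in Eq. 13-33 (x must not be an integer)."""
    return math.sin((2 * N + 1) * math.pi * x) / math.sin(math.pi * x)

# Compare the two forms for several array sizes and non-integer values of S.a.
checks = [(N, x, fringe_sum(N, x), fringe_closed_form(N, x))
          for N in (1, 5, 20)
          for x in (0.13, 0.5, 0.77)]
```

Because the terms for +n and −n are complex conjugates, the direct sum is real, as the closed form requires.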

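The intensity of scattering from the array will be

    I(\mathbf{S}) = F_{Tot}(\mathbf{S}) F_{Tot}^*(\mathbf{S}) = f^2(S) \sin^2[(2N+1)\pi \mathbf{S} \cdot \mathbf{a}] / \sin^2(\pi \mathbf{S} \cdot \mathbf{a})    (13-34)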

This is shown schematically as a function of the size of the array in Figure 13-6b. 
You can see that, as N becomes large, the intensity tends to zero everywhere except 
where S • a is integral. The term S • a that appears in Equation 13-34 measures the 
relative orientation of the sample and the detector. Note that the scattering intensity 
is a maximum when S • a = 0. This occurs when S is in a plane perpendicular to the 
long axis of the array. 



Discontinuous diffraction pattern from a one-dimensional array 

It is helpful to demonstrate explicitly the behavior of Equation 13-33 as N becomes
large. For most values of S·a, the value of sin(πS·a) lies between 0.1 and 1.0 or
between −0.1 and −1.0. Around these values of S·a, the value of sin[(2N + 1)πS·a]
oscillates wildly between −1 and 1. Therefore, the quotient in Equation 13-33 falls in
the range of about −10 to 10, regardless of the value of N. However, what happens
when sin(πS·a) approaches zero? It is easiest to examine this behavior in the limit
where S·a → 0. If we use the series expansion for sin x = x − x³/3! + ···, and if
we keep only the first term as x → 0, the quotient in Equation 13-33 becomes
(2N + 1)(πS·a)/(πS·a) = 2N + 1.
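The contrast between the bounded quotient at general S·a and the value 2N + 1 at integral S·a is easy to see numerically. The sketch below evaluates the quotient of Equation 13-33 with x standing for S·a; the choice N = 1000, the sample points, the tolerance, and the function name are assumptions.

```python
import math

def fringe_quotient(N, x, eps=1e-12):
    """sin[(2N+1) pi x] / sin(pi x), with x standing for S.a.

    When sin(pi x) is numerically zero (x an integer), return the limit 2N + 1.
    """
    denom = math.sin(math.pi * x)
    if abs(denom) < eps:
        return float(2 * N + 1)
    return math.sin((2 * N + 1) * math.pi * x) / denom

N = 1000
peak = fringe_quotient(N, 1.0)        # S.a exactly integral
near_peak = fringe_quotient(N, 1e-7)  # S.a very close to an integer
off_peak = fringe_quotient(N, 0.37)   # S.a between integers
```

The off-peak value is bounded by 1/|sin(πS·a)| whatever the value of N, whereas the peak grows linearly with the number of atoms, so the peak intensity grows as (2N + 1)².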

In a crystalline array of molecules, N can be 10^6 or more. Therefore, the structure
factor becomes enormous each time sin(πS·a) goes to zero. This occurs every
time S·a approaches an integer. Compared with the sharp peak in scattering for
integral S·a, all other values are negligible. Therefore, the fringe function of a linear
array leads to a discontinuous scattering pattern (Box 13-4 illustrates the analogous
effect in optical diffraction). Only for certain orientations of sample and x-ray detector
will any scattering be observed at all. This result is called a von Laue condition:

    \mathbf{S} \cdot \mathbf{a} = h, where h = 0, ±1, ±2, ...    (13-35)

The vector a is a property of the particular one-dimensional crystalline sample
and its orientation in space. The vector S depends on the geometry of the scattering
experiment. The observed scattering depends only on S·a and is intense only when
Equation 13-35 is satisfied. Figure 13-7 shows the geometrical significance of this.
S·a is the projection of S onto a, multiplied by |a|. Suppose a is fixed. Then S·a = 0 means that S
can be any vector in a plane perpendicular to a and passing through the origin
(Fig. 13-7a). S·a = 1 means that S can be any vector from the origin to a plane
perpendicular to a and spaced a distance 1/|a| away from the origin (Fig. 13-7b). For
example, if S is parallel to a, then S·a = 1 implies that |S| = 1/|a|.




Figure 13-7 

The von Laue scattering conditions for a one-dimensional array. Scattering vectors
are shown as solid arrows. (a) For S·a = 0. (b) For S·a = 1.



Box 13-4 OPTICAL DIFFRACTION PATTERNS FROM ARRAYS 



The same mathematical formalism developed in the text to calculate x-ray diffraction from 
molecular arrays also applies to optical diffraction from arrays of slits or pinholes. The figures 
show the optical diffraction pattern from a series of opaque masks containing increasingly 
more elaborate arrays of pinholes. Such diffraction patterns can be created by the apparatus 
shown in Figure 10-4a by using the mask as a sample. The figures on the left show the sample 
masks used; the corresponding figures on the right indicate the diffraction patterns produced 
by the masks. 
(a) A six-atom molecule, modeled by six pinholes. (b) Two six-atom molecules in a row.
Note how the presence of two atoms introduces additional vertical fringes. (c) Four six-atom
molecules. The horizontal repeat in structure leads to additional horizontal fringes. (d) A
vertical row of many pairs of six-atom molecules. Note how the diffraction pattern sharpens
in the vertical direction but remains broad in the horizontal direction. (e) A two-dimensional
crystalline array of six-atom molecules. Note that the diffraction pattern is now a set of sharp
spots. (f) A different crystalline array of the same molecules. The smaller reciprocal lattice
results from the larger crystal lattice. [From G. Harburn, C. A. Taylor, and T. R. Welberry,
Atlas of Optical Transforms (Ithaca, N.Y.: Cornell Univ. Press, 1975).]
Figure 13-8

Experimental conditions for observation of scattering from the linear array of atoms shown in Figure 13-6.
(a) A set of parallel planes, representing the von Laue condition imposed by the array of atoms.
(b) For a fixed direction s_0 of incident x rays, the possible scattering vectors (black) must lie on the
surface of the sphere. (See Fig. 13-3a for further information.) (c) The intersection of the two sets of
conditions outlined in parts a and b is shown for two different relative geometries of a and s_0.

By extending this argument, it is clear that integral values of S·a define a set
of parallel planes. The spacing between these planes is 1/a (Fig. 13-8a). The set of
parallel planes in reciprocal space defines all those values of the scattering vector
that produce measurable intensity. Further constraints are introduced if the wavelength
of the incident x rays is held constant, and if their direction is fixed at s_0.

Once a particular s_0 is selected, the various possible observation directions s
lead to a restricted set of possible scattering vectors S. Note from Figure 13-3b that
the tip of the vector S always extends from the origin to a point at a distance 1/λ
along s. The locus of all points located a distance 1/λ from the tail of s defines a sphere of radius 1/λ
centered at the tail of s. Thus all possible scattering vectors S must extend from the
origin to the surface of a sphere of radius 1/λ (Fig. 13-8b). This sphere is called the
sphere of reflection. It is always tangent to the plane drawn through the origin
perpendicular to the direction of incident x rays, s_0.

The scattering of x rays will be observed only when both the von Laue conditions
and the conditions of the sphere of reflection are satisfied. This means that the
scattering vector must lie on points formed by the intersection of the surface of a
sphere and a set of parallel planes. As shown in Figure 13-8c, this intersection is a
set of parallel circles. The orientation of the circles, and the identity of the particular
planes from which they originate, depend on the angle between s_0 and a.
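Which planes actually cut the sphere of reflection depends on the angle between s_0 and a, and this can be worked out with a short sketch. It assumes the convention S = (s − s_0)/λ with unit vectors s and s_0, so that the sphere has radius 1/λ and is centered at −s_0/λ; a plane S·a = h then meets the sphere when its distance from that center is at most 1/λ. The function name, the tolerance, and the sample geometries are hypothetical.

```python
import math

def allowed_plane_indices(a, s0, wavelength):
    """Integers h for which the plane S.a = h intersects (or touches) the sphere of reflection.

    Assumes S = (s - s0) / wavelength with unit vectors s and s0, so the sphere
    has radius 1/wavelength and center C = -s0_hat / wavelength.
    """
    a_len = math.sqrt(sum(c * c for c in a))
    s0_len = math.sqrt(sum(c * c for c in s0))
    # C . a, where C is the center of the sphere of reflection.
    center_dot_a = -sum(x * y for x, y in zip(s0, a)) / (s0_len * wavelength)
    # |C.a - h| / |a| <= 1/wavelength  gives the allowed range of h.
    lo = center_dot_a - a_len / wavelength
    hi = center_dot_a + a_len / wavelength
    tol = 1e-9  # small tolerance so that tangent planes are counted
    return list(range(math.ceil(lo - tol), math.floor(hi + tol) + 1))

# Array perpendicular to the beam, then parallel to it (|a| = 5, wavelength = 1).
h_perpendicular = allowed_plane_indices((5.0, 0.0, 0.0), (0.0, 0.0, 1.0), 1.0)
h_parallel = allowed_plane_indices((0.0, 0.0, 5.0), (0.0, 0.0, 1.0), 1.0)
```

With the array perpendicular to the beam the allowed planes are symmetric about h = 0; with the array parallel to the beam they all lie on one side, because the sphere touches the h = 0 plane at the origin. The sign of the one-sided range depends on the sign convention assumed for S.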

Sampling the scattering from any atom or molecule in a periodic array 

For a sample consisting of a single atom, the atomic scattering factor f(S) would be
measurable with a single sample orientation and s_0, at geometries S, anywhere on
the surface of a sphere of radius 1/λ. With a linear array of atoms oriented along a,
this atomic transform now can be measured only where this sphere is intersected
by a set of parallel planes with a spacing of 1/a (Fig. 13-8c). One describes this by
saying that the originally broad atomic or molecular structure factor now is sampled
at discrete places. Figure 13-6 shows an additional example of this sampling (see
also Box 13-4). The orientation and spacing of the scattering planes contain all the
information about the array and no information about the atom or molecule. The
actual value (amplitude and phase) of the structure factor at these sampling positions
retains information about the structure of the atom or molecule.

Note that all of the conditions restricting the observation of scattering have
been plotted in Figures 13-7 and 13-8 in terms of the vector S. The dimensions of S
are reciprocal distance, and so the coordinate system shown in these figures is reciprocal
space. Increasing the distance between atoms of an array (in real space) will
result in decreasing the spacing imposed by the von Laue conditions on the parallel
planes (in reciprocal space).

A fixed orientation of a and s_0 allows only a restricted region of reciprocal space
to be sampled in an x-ray scattering experiment. This region can be enlarged by
changing the angle between a and s_0, by rotating either the sample or the angle of
incidence of the x rays. The largest possible value of |S| for any geometry is 2/λ (Fig.
13-3c). Therefore, the maximal region of reciprocal space that can be sampled, after
all orientations of a and s_0 have been tried, is a sphere of radius 2/λ centered at the
origin of reciprocal space. This sphere is called the limiting sphere (see Fig. 13-23b).



X-ray scattering actually observed in the laboratory frame 

The scattering vector S is very convenient for mathematics, but it tends to obscure
what is happening in an experiment. Therefore, Figure 13-9 shows the diffraction
from a linear array of identical scatterers in the laboratory frame. Assume that the
sample is placed at the center of a cylinder of x-ray film (Fig. 13-9a). X rays are incident
along a fixed direction, and all of the scattered intensity is detected by the film. Different
vectors S now correspond to different scattering angles 2θ. Consider each
scatterer just as a single atom. Then, if the sample had only a single atom at the origin,
the scattering would be given by Equation 13-23 (Fig. 13-9b).

Figure 13-9

X-ray scattering from a one-dimensional array as seen in the laboratory. (a) X rays incident along the z
axis strike a sample at the origin, and scattered rays are detected by a cylindrical film. (b) Scattering
pattern produced by a single atom. The pattern is elliptical because of the cylindrical film; the pattern
would be circular if flat film were used. (c) The linear array. (d) The scattered radiation allowed by
the von Laue conditions, viewed in the x-z plane. (e) The scattered radiation allowed by the von Laue
conditions, viewed in the y-z plane. (f) Cones of scattered radiation produced by the von Laue
conditions, for the particular geometry shown in part a. All scattered rays will extend along the surface
of one of the cones. (g) Diffraction pattern of the array, resulting from the intersection of the scattering
cones with the cylindrical film. (h) The actual scattering seen is the product of the atomic scattering
shown in part b with the diffraction pattern of part g. (i) An alternative scattering geometry with the
array parallel to the direction of incident radiation. (j) The scattered rays allowed by the von Laue
conditions in the geometry of part i. (k) The array diffraction pattern resulting from the geometry of
part i. (l) The product of the diffraction pattern of part k with the atomic scattering of part b.

The effect of the linear array is to allow finite intensity only at scattering geometries
corresponding to the intersection of the set of planes (a·S = h for h = 0,
±1, ±2, ...) with the sphere of reflection. For each scattering vector S drawn to
one of these points of intersection, there corresponds a ray of scattered radiation.
From Equation 13-3, this ray propagates in the direction s = λS + s_0 (Fig. 13-3).

To compute the pattern of scattered radiation, it is easiest to use the description
of scattering shown in Figure 13-3b rather than the equivalent one shown in Figure
13-3a. The vector s is placed along the s_0 axis a distance 1/λ from the origin. In other
words, s can be shown as emanating from the center of the sphere of reflection. When
drawn in this manner, s points toward the tip of the corresponding scattering vector
S (Figs. 13-3b and 13-9e).

The vectors a and s_0 are fixed by the choice of orientation of the sample and
the incident beam. Because of the von Laue constraint, each scattering vector S
extends from the origin to one of the planes spaced 1/a apart. Figure 13-9d,e shows
two cross sections through the origin. In the plane defined by S·a = 0, a continuous
set of scattering vectors S is allowed. This leads to a continuous distribution of
scattered radiation, which emanates from the sample in a circle parallel to the S·a = 0
plane (Fig. 13-9f).

In the plane parallel to a, only certain values of S are allowed. Thus, vectors
describing scattered radiation appear only at certain deflection angles α (Fig. 13-9d).
Elementary geometrical considerations indicate that sin α = hλ/a, where h is any
integer such that |h| ≤ a/λ. Each value of h leads to a cone of scattering (Fig. 13-9f).
Where this cone intersects the cylindrical film, a ring of scattered intensity results.
When the film is unrolled, this ring becomes a line, called a layer line. The various
layer lines are all parallel, and their spacing increases progressively as |h| increases.
The lines are perpendicular to the linear array (Fig. 13-9g).

The film records scattering from the individual atom of Figure 13-9b only in
the lines allowed by the array (Fig. 13-9g). This effect leads to the pattern of scattering
shown in Figure 13-9h. Note that only a finite number of layer lines are seen because
of the conditions imposed by the sphere of reflection. There is a maximal value of 1
for sin α. For the geometry shown in Figure 13-9c, this value will occur when s is
parallel to a. Here the largest value of h that can be included within the sphere of
reflection is ±|a|/λ. For example, suppose |a| is 5 Å, and λ is 1 Å. Then the integer h
can have values only in the range −5 ≤ h ≤ +5; the diffraction pattern will have
11 layer lines.
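The layer-line count in this example can be reproduced with a short calculation based on sin α = hλ/a. This is a minimal sketch for the geometry in which the incident beam is perpendicular to the array; the function name and the small rounding tolerance are assumptions.

```python
import math

def layer_line_angles(a_len, wavelength):
    """Deflection angle (degrees) of each layer line, from sin(alpha) = h * wavelength / a_len,
    for every integer h with |h| <= a_len / wavelength."""
    h_max = math.floor(a_len / wavelength + 1e-9)  # tolerance guards against rounding
    angles = {}
    for h in range(-h_max, h_max + 1):
        sin_alpha = max(-1.0, min(1.0, h * wavelength / a_len))
        angles[h] = math.degrees(math.asin(sin_alpha))
    return angles

# The example from the text: |a| = 5 angstroms, wavelength = 1 angstrom.
angles = layer_line_angles(5.0, 1.0)
n_layer_lines = len(angles)
```

The step in α between successive layer lines grows with |h|, and the outermost lines (h = ±5) correspond to α = ±90°, where s is parallel to a.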

Figure 13-9h shows that the intensities drop off as h increases. This occurs
because the atomic scattering decreases as e^{−S²}. As demonstrated in Figure 13-9d,e,
large values of h tend to correspond to large values of |S|. The scattered intensity also
drops off rapidly on each layer line as one moves from the center to the edges. This
is because, along each line of scattering, larger angles inevitably correspond to longer
lengths of the scattering vector, |S| (see Fig. 13-9e).

The appearance of the scattering pattern depends markedly on the relative
orientation of the incident beam (s_0) and the repeating array (a). For example, if the
array is rotated so that a and s_0 are parallel (Fig. 13-9i), the scattering pattern changes
from a series of lines to a series of concentric curves (Fig. 13-9k). These curves are
elliptical because the intersection of a circle (the cone of scattered radiation) and a
cylindrical surface with its long axis parallel to the plane of the circle (the film) is an
ellipse.

X-ray scattering from a two-dimensional array of atoms 

The results shown in Figures 13-8 and 13-9 are the basic ideas behind all x-ray crystallographic
measurements. However, if they are to be useful in practice, one must
extend them from a one-dimensional array to a three-dimensional crystal. First,
consider a two-dimensional net of molecules (Fig. 13-10a). The periodicity of this
net is defined by two vectors, a and b. In general, a and b are not perpendicular to
one another, nor are they of the same length. The periodicity along a will cause a set
of scattering fringes at S·a = h, exactly as for the one-dimensional array in Figure
13-8. The additional periodicity along b causes a comparable set of fringes defined
by S·b = k, where k is any integer. By the arguments used earlier, you can see that
these latter fringes are parallel planes perpendicular to the vector b and spaced by
equal increments 1/|b| (Fig. 13-10b).

The x-ray scattering will have finite intensity only where both von Laue conditions
(S·a = h, and S·b = k) are satisfied. This condition is met at the intersection
of the two sets of planar fringes. That intersection is an array of parallel lines (Fig.
13-10c). The lines all are perpendicular to the plane defined by a and b. Experimentally,




Figure 13-10

X-ray scattering from a two-dimensional array. (a) The array defined by a and b. (b) Parallel planes
demonstrate the allowed positions of the scattering vector S. Shown in cross section are planes spaced
by 1/a perpendicular to a, and planes spaced by 1/b perpendicular to b. (c) The array of lines resulting
from the intersection of the parallel planes shown in part b. Scattering vectors extending from the origin
to any point along one of these lines will yield observable x-ray scattering intensity.
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scattering intensity will be observed whenever the geometry of incident and diffracted 
radiation leads to a scattering vector that extends from the origin to a point along 
one of the arrays of parallel lines. 
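
The two von Laue conditions can be illustrated with a minimal Python sketch (not part of the original text). It solves S • a = h and S • b = k for the components of S that lie in the plane of the array. The function name and the example vectors are illustrative assumptions. The component of S perpendicular to the plane is left unconstrained, which is why each (h, k) pair corresponds to a line rather than a point.

```python
def in_plane_scattering_vector(a, b, h, k):
    """Solve S.a = h and S.b = k for the in-plane part of S.

    a and b are 2D lattice vectors given as (x, y) tuples; h and k are integers.
    """
    det = a[0] * b[1] - a[1] * b[0]
    if det == 0:
        raise ValueError("a and b must not be parallel")
    # Cramer's rule for the 2 x 2 linear system.
    sx = (h * b[1] - k * a[1]) / det
    sy = (k * a[0] - h * b[0]) / det
    return (sx, sy)


# Hypothetical oblique net: a and b differ in length and are not perpendicular.
a = (2.0, 0.0)
b = (1.0, 3.0)
for h in range(-1, 2):
    for k in range(-1, 2):
        print(h, k, in_plane_scattering_vector(a, b, h, k))
```

Each printed pair is the foot of one of the parallel lines of allowed scattering vectors.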

The greater restriction of scattering caused by a two-dimensional array means 
that more of the intensity will be concentrated at a smaller number of sets of scattering 
angles, 2θ. The effect of a two-dimensional array on the actual experimental scattering
pattern is shown schematically in Figure 13-11. The plane of the array is perpendicular 
to the direction of incident x rays. If the array is considered as the effect of two 
perpendicular one-dimensional arrays, each alone would produce a set of scattering 
fringes. The x-ray diffraction pattern is the product of the two sets of fringes. In this 
case, the scattered radiation detected by the x-ray film consists of a series of spots, 
each occurring at the intersection of two fringes. (See Box 13-4 for examples of the 
optical diffraction of two-dimensional arrays.) 

X-ray diffraction from a three-dimensional array of atoms 

It is easy mathematically to generalize x-ray scattering to three dimensions, but it is not 
so easy to visualize the results. For an array such as that found in a real three-dimen-
sional crystal, there is now a third periodicity defined by the vector c (Fig. 13-12a).
This leads to a third set of planar scattering fringes given by S • c = l, where l = 0,
±1, ±2, . . . . This set of fringes intersects the parallel lines generated by S • a = h
and S • b = k. The result is a three-dimensional lattice of points, spaced evenly by
1/|a| in the direction perpendicular to a, by 1/|b| perpendicular to b, and by 1/|c|
perpendicular to c (Fig. 13-12b). Diffracted radiation will be observed only when the
scattering vector S intersects one of the lattice points. It is not easy to illustrate the 
actual diffraction pattern from a three-dimensional crystal because it is a three- 
dimensional pattern. 

Note that the lattice that describes allowed scattering geometries is not the 
same as the lattice of points that represents the positions of the atoms in the array. 
The position lattice has spacings a, b, and c, whereas the spacings in the diffraction 
lattice, a*, b*, and c*, are related to the inverse of these. This scattering lattice is
called the reciprocal lattice. The vector space it occupies is called the reciprocal space
(Fig. 13-12c).
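
The passage says only that a*, b*, and c* are related to the inverse of a, b, and c. The sketch below assumes the standard crystallographic construction a* = (b × c)/V with V = a • b × c (and cyclic permutations), which is not spelled out in the text above. All function names are illustrative. With this construction, S = ha* + kb* + lc* satisfies the three von Laue conditions automatically.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def reciprocal_vectors(a, b, c):
    """Return (a*, b*, c*) assuming a* = (b x c)/V with V = a . (b x c)."""
    volume = dot(a, cross(b, c))
    if volume == 0:
        raise ValueError("a, b, and c must not be coplanar")
    a_star = tuple(x / volume for x in cross(b, c))
    b_star = tuple(x / volume for x in cross(c, a))
    c_star = tuple(x / volume for x in cross(a, b))
    return a_star, b_star, c_star


def scattering_vector(a_star, b_star, c_star, h, k, l):
    """Reciprocal lattice point S = h a* + k b* + l c*."""
    return tuple(h * x + k * y + l * z
                 for x, y, z in zip(a_star, b_star, c_star))
```

Because a • a* = 1 and a • b* = a • c* = 0 under this assumption, S • a = h follows directly, and likewise for b and c.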

The array (crystal) we have described thus far contains only a single atom per 
repeating unit. The positions of the atoms define a set of cells bounded by a, b, and c 



Figure 13-11 

X-ray scattering observed in the laboratory for a two-dimensional array perpendicular to the direction of
incident radiation. Each part shows patterns for both cylindrical and spherical film. (a) Sample
geometries. (b) Layer lines resulting from the a periodicity of the array. (c) Layer lines resulting from
the b periodicity of the array. (d) Actual scattering observed is the product of the functions shown in
parts b and c with the atomic scattering pattern shown in Figure 13-9b.

Figure 13-12

Three-dimensional arrays. (a) Vectors a, b, and c define the array. (b) Scattering planes resulting from
each one-dimensional periodicity intersect to give lines for each two-dimensional periodicity and points
for each three-dimensional periodicity. The array of points that results is the reciprocal lattice.
(c) The reciprocal lattice is generated by the vectors a*, b*, and c*. Several cells of the reciprocal lattice
are shown. Scattering vectors extending from the origin to these points result in observed intensity.

(Fig. 13-12a). It is convenient to express the position of the atoms in terms of a co-
ordinate system defined by these vectors. A vector drawn from the origin to the jth
atom position is r = xa + yb + zc. Because the atoms lie at the corners of the cells,
x, y, and z must be integers. 

The vector r can be used to calculate the x-ray scattering from the array. From 
Equation 13-27, the structure factor is a sum over all atom positions: 

F_{\mathrm{Tot}}(\mathbf{S}) = \sum_x \sum_y \sum_z f(\mathbf{S})\, e^{2\pi i \mathbf{S} \cdot (x\mathbf{a} + y\mathbf{b} + z\mathbf{c})}    (13-36)

Inserting the von Laue conditions (S • a = h, etc.), we can rewrite this as 



F_{\mathrm{Tot}}(h, k, l) = \sum_x \sum_y \sum_z f(\mathbf{S})\, e^{2\pi i (hx + ky + lz)}    (13-37)

where h, k, and l are any integers. Every diffracted ray can be computed by choosing
the appropriate integral values of h, k, and l and summing over the array. Note that,
for an array of identical atoms, each exponential term in Equation 13-37 is simply
unity, because h, k, l, x, y, and z are all integers. Therefore, Equation 13-37 becomes



F_{\mathrm{Tot}}(h, k, l) = N f(\mathbf{S})    (13-38)

where N is the number of atoms in the array, and f(S) is the atomic scattering factor,
now evaluated only at the particular values of S allowed by integral choices of h,
k, and l. Thus the x-ray scattering is just the single-atom scattering sampled at all
points in reciprocal space allowed by the von Laue conditions imposed by the lattice.
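
As a quick numerical check of Equation 13-38, the following sketch sums the exponential term of Equation 13-37 over a small cubic block of lattice points. The function name and the block size are illustrative assumptions, and the atomic scattering factor is omitted because it is a common factor. For integral h, k, and l every term is unity and the sum equals the number of atoms; for a non-integral index the terms interfere.

```python
import cmath


def lattice_sum(h, k, l, n):
    """Sum exp[2 pi i (hx + ky + lz)] over integer x, y, z in range(n)."""
    total = 0j
    for x in range(n):
        for y in range(n):
            for z in range(n):
                total += cmath.exp(2j * cmath.pi * (h * x + k * y + l * z))
    return total


# Integral indices: every term is 1, so the magnitude is n**3 atoms.
print(abs(lattice_sum(1, 2, 3, 4)))
# Half-integral h: successive terms along x cancel in pairs.
print(abs(lattice_sum(0.5, 0, 0, 4)))
```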

Equation 13-37 is a Fourier series rather than a Fourier transform because of 
the discrete nature of the diffraction pattern. However, it can be inverted in exactly 
the same way as a transform. By analogy to Equation 13-8, the electron density 
distribution of the array will be given by 



\rho(x, y, z) = \frac{1}{NV} \sum_{h=-\infty}^{\infty} \sum_{k=-\infty}^{\infty} \sum_{l=-\infty}^{\infty} F_{\mathrm{Tot}}(h, k, l)\, e^{-2\pi i (hx + ky + lz)}    (13-39)

where V is the volume of one unit cell, V = a • b × c, so that NV is the volume of
the entire array. The presence of the volume factor V in Equation 13-39 is easy to
rationalize. F(h, k, l) is proportional to the number of electrons, whereas ρ (an electron
density) has units of electrons per unit volume.
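
The inversion in Equation 13-39 can be tried out in one dimension. The sketch below is an illustration under stated assumptions, not the text's own procedure: atoms are treated as point scatterers with a constant weight in place of f(S), a single unit cell of unit length is used (so N = V = 1), and the infinite sum is truncated at |h| ≤ h_max. The function names are hypothetical.

```python
import cmath


def structure_factors(atoms, h_max):
    """Return {h: F(h)} for point atoms given as (weight, fractional x)."""
    return {
        h: sum(w * cmath.exp(2j * cmath.pi * h * x) for w, x in atoms)
        for h in range(-h_max, h_max + 1)
    }


def density(factors, x):
    """Truncated Fourier series rho(x) = sum_h F(h) exp(-2 pi i h x)."""
    total = sum(F * cmath.exp(-2j * cmath.pi * h * x)
                for h, F in factors.items())
    return total.real


# One hypothetical atom of unit weight at fractional position 0.3.
factors = structure_factors([(1.0, 0.3)], h_max=10)
for i in range(10):
    x = i / 10
    print(x, round(density(factors, x), 3))
```

The synthesized density peaks at the atomic position, and the peak sharpens as more terms of the series are included.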

X-ray diffraction from a three-dimensional molecular crystal

Although Equations 13-37, 13-38, and 13-39 were derived for an array with only 
one atom at the corner of each cell, it can be shown that similar equations hold for 
any real crystal. The repeating element of a crystal is defined as a unit cell. The crystal 
is a lattice of unit cells, each defined by the vectors a, b, and c (Fig. 13-12a). Whatever
is inside each cell, whether a single atom, a molecule, or many molecules, the
pattern of electron density repeats periodically throughout the crystal by translation 
along the vectors a, b, and c. Therefore, the von Laue conditions still apply and 
restrict the sampling of the structure factor to only those points defined by the reci- 
procal lattice. Now, however, what is sampled is not the atomic scattering factor. It 
is instead a molecular structure factor, or unit cell structure factor, given by Equation 
13-27. In order to show this, it is useful to employ a mathematical device known as 
a convolution. 



A repeating structure as a convolution 

X-ray diffraction examines both the properties of the crystal lattice and the structure 
of individual molecules. The distribution of electron density within each unit cell is 
identical. The lattice describes how this distribution is replicated into a three-dimen- 
sional pattern. Such a repeating structure can be very conveniently described as a 
convolution. 

Consider two arbitrary one-dimensional functions, f(x) and g(x). The functions
f and g both are defined along the x axis. Their convolution product is defined as

\widehat{fg}(u) = \int_{-\infty}^{\infty} dx\, f(x)\, g(u - x)    (13-40)

where the variable u can take on any value that x can. It is equivalent to x, except that
it is held constant in the integral.

We will try to develop a physical picture of the convolution product. See the 
example shown in Figure 13-13a. The function g(u) plotted along the u axis is identical
to a plot of g(x) along the x axis. The function g(u - x) plotted along the u axis is just
the function g(u) shifted in space by an amount x. Therefore, in the convolution
product, the function g is placed successively at all points along the u axis, but it is 
multiplied by a weighting factor, f(x), for each shift in position x. All the weighted 
values of g are added or integrated to produce the final convolution. 
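
The physical picture just described translates directly into a discrete sketch, given here as an illustration rather than as part of the original text. Sampled functions are represented as lists, and the integral is replaced by a sum; the function name and sample values are assumptions. For each position x, a copy of g is laid down starting at x and weighted by f[x], and all the weighted copies are added.

```python
def convolve(f, g):
    """Discrete analogue of the convolution product: out[u] = sum_x f[x] g[u - x]."""
    out = [0.0] * (len(f) + len(g) - 1)
    for x, fx in enumerate(f):
        # Lay down a copy of g shifted by x and weighted by f[x].
        for shift, gs in enumerate(g):
            out[x + shift] += fx * gs
    return out


f = [1.0, 2.0]        # weights
g = [1.0, 1.0, 1.0]   # image to be replicated
print(convolve(f, g))
```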

Physically, the convolution \widehat{fg}(u) means that one is laying down successive images
of g weighted by f.§ It turns out to be equivalent to say that one is laying down images
of f weighted by g. To see this, let x' = u - x in Equation 13-40. Then dx' = -dx, and



§ If this is not absolutely clear, stop right here. Study Figure 13-13a: try calculating a convolution
product for the simple functions of your choice.




Figure 13-13

Convolution integrals. (a) Two functions f and g and their convolution calculated
by Equation 13-40. (b) Two functions, a tree, f, and a lattice (set of delta functions),
g, and their convolution.



(taking note of the reversal of the limits of integration)

\widehat{fg}(u) = -\int_{\infty}^{-\infty} dx'\, f(u - x')\, g(x') = \int_{-\infty}^{\infty} dx'\, f(u - x')\, g(x') = \widehat{gf}(u)    (13-41)

because x and x' are just dummy variables. 

Suppose the function g is the Dirac delta function, δ(x - a). Then the convolution

\widehat{fg}(u) = \int_{-\infty}^{\infty} dx\, f(x)\, \delta[u - (x - a)] = f(u + a)    (13-42)

just shifts the function the distance a along the u coordinate system. In three dimen-
sions, all the same results hold. The convolution product is



\widehat{fg}(\mathbf{u}) = \int_{-\infty}^{\infty} d\mathbf{r}\, f(\mathbf{r})\, g(\mathbf{u} - \mathbf{r})    (13-43)

If g is the three-dimensional delta function, δ(r - p), the convolution integral defined
in Equation 13-43 merely shifts the function f(r) along the vector p in the u coordinate
system.

Delta functions can be used to describe a lattice. It is sufficient to define the 
origin of each unit cell. In one dimension, the origin can be any integral multiple of 
the vector a. This restricts values of x that define the origin to na. A function that 
describes one of these values is δ(x - na). Thus, an infinite one-dimensional lattice
is given by the function

L(x) = \sum_{n=-\infty}^{\infty} \delta(x - na)    (13-44)

This function really is just a list of all the lattice points. 

Suppose that the electron density distribution within one cell of the lattice is ρ(x).
Then to describe the crystal, we want to replicate this electron density in each unit cell.
From the properties of the convolution integral described above, this is done by

\mathrm{Crystal} = \widehat{L\rho}(u) = \int_{-\infty}^{\infty} dx\, \rho(x) \sum_{n=-\infty}^{\infty} \delta[u - (x - na)] = \sum_{n=-\infty}^{\infty} \rho(u + na)    (13-45)

The convolution \widehat{L\rho} simply lays down an image of the structure in each unit cell
(Fig. 13-13b).
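
A discrete sketch of this replication follows; it is an illustration under stated assumptions, not the text's own calculation. The lattice is a list that is 1 at every cell origin and 0 elsewhere (a sampled stand-in for the sum of delta functions), and the motif is a short list of hypothetical density samples. Convolving the two places one copy of the motif at each lattice point.

```python
def convolve(f, g):
    """Discrete convolution: out[u] = sum_x f[x] g[u - x]."""
    out = [0.0] * (len(f) + len(g) - 1)
    for x, fx in enumerate(f):
        for shift, gs in enumerate(g):
            out[x + shift] += fx * gs
    return out


def replicate_motif(motif, cell_length, n_cells):
    """Build a 1D 'crystal' by convolving a comb of unit spikes with the motif."""
    lattice = [1.0 if i % cell_length == 0 else 0.0
               for i in range(cell_length * n_cells)]
    return convolve(lattice, motif)


# Hypothetical motif of three density samples in a cell five samples long.
print(replicate_motif([1.0, 3.0, 2.0], cell_length=5, n_cells=3))
```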

In three dimensions, an infinite lattice is described by 

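L(\mathbf{r}) = \sum_{n=-\infty}^{\infty} \sum_{m=-\infty}^{\infty} \sum_{p=-\infty}^{\infty} \delta(\mathbf{r} - n\mathbf{a} - m\mathbf{b} - p\mathbf{c})    (13-46)

where n, m, and p are any integers. The electron density within one cell is ρ(r), and
the crystal is described by

\mathrm{Crystal} = \widehat{L\rho}(\mathbf{u}) = \sum_{n=-\infty}^{\infty} \sum_{m=-\infty}^{\infty} \sum_{p=-\infty}^{\infty} \rho(\mathbf{u} + n\mathbf{a} + m\mathbf{b} + p\mathbf{c})    (13-47)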



Thus any crystal can be described as the convolution of the contents of one unit cell 
with the lattice. 

The Fourier transform of a convolution

There is a property of convolutions that makes them especially useful for describing
x-ray scattering. Suppose that f(r) and g(r) are functions that can be expressed as the
Fourier transforms of the functions F(S) and G(S):



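f(\mathbf{r}) = \int_{-\infty}^{\infty} d\mathbf{S}\, F(\mathbf{S})\, e^{-2\pi i \mathbf{S} \cdot \mathbf{r}}    (13-48a)

g(\mathbf{r}) = \int_{-\infty}^{\infty} d\mathbf{S}\, G(\mathbf{S})\, e^{-2\pi i \mathbf{S} \cdot \mathbf{r}}    (13-48b)

and, consequently, F and G are inverse transforms of f and g:

F(\mathbf{S}) = \int_{-\infty}^{\infty} d\mathbf{r}\, f(\mathbf{r})\, e^{2\pi i \mathbf{S} \cdot \mathbf{r}}    (13-49a)

G(\mathbf{S}) = \int_{-\infty}^{\infty} d\mathbf{r}\, g(\mathbf{r})\, e^{2\pi i \mathbf{S} \cdot \mathbf{r}}    (13-49b)

Then the convolution product of the two functions can be written as

\widehat{fg}(\mathbf{u}) = \int_{-\infty}^{\infty} d\mathbf{r}\, f(\mathbf{r})\, g(\mathbf{u} - \mathbf{r})
    = \int_{-\infty}^{\infty} d\mathbf{r} \int_{-\infty}^{\infty} d\mathbf{S}\, F(\mathbf{S})\, e^{-2\pi i \mathbf{S} \cdot \mathbf{r}} \int_{-\infty}^{\infty} d\mathbf{S}'\, G(\mathbf{S}')\, e^{-2\pi i \mathbf{S}' \cdot (\mathbf{u} - \mathbf{r})}    (13-50)

Rearranging the order of the integrals, we obtain

\widehat{fg}(\mathbf{u}) = \int_{-\infty}^{\infty} d\mathbf{S}\, F(\mathbf{S}) \int_{-\infty}^{\infty} d\mathbf{S}'\, G(\mathbf{S}')\, e^{-2\pi i \mathbf{S}' \cdot \mathbf{u}} \int_{-\infty}^{\infty} d\mathbf{r}\, e^{2\pi i (\mathbf{S}' - \mathbf{S}) \cdot \mathbf{r}}    (13-51)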



The third integral is just the Dirac delta function, δ(S' - S) (see Box 13-3). Therefore,
the result of the second integral is just to set S' = S, and Equation 13-51 becomes

\widehat{fg}(\mathbf{u}) = \int_{-\infty}^{\infty} d\mathbf{S}\, F(\mathbf{S})\, G(\mathbf{S})\, e^{-2\pi i \mathbf{S} \cdot \mathbf{u}}    (13-52)



Note that Equation 13-52 is just the Fourier transform of the product of the two 
functions F(S) and G(S). 

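This is an important conclusion. The Fourier transform of the product of two
functions, F and G, is the convolution product of their two Fourier transforms, f and g.
We can derive a second important result by Fourier-transforming Equation 13-52:

\int_{-\infty}^{\infty} d\mathbf{u}\, e^{2\pi i \mathbf{S}' \cdot \mathbf{u}}\, \widehat{fg}(\mathbf{u}) = \int_{-\infty}^{\infty} d\mathbf{u} \int_{-\infty}^{\infty} d\mathbf{S}\, F(\mathbf{S})\, G(\mathbf{S})\, e^{-2\pi i \mathbf{S} \cdot \mathbf{u}}\, e^{2\pi i \mathbf{S}' \cdot \mathbf{u}}
    = \int_{-\infty}^{\infty} d\mathbf{S}\, F(\mathbf{S})\, G(\mathbf{S}) \int_{-\infty}^{\infty} d\mathbf{u}\, e^{2\pi i (\mathbf{S}' - \mathbf{S}) \cdot \mathbf{u}}    (13-53)

However, the second integral is just the Dirac delta function, δ(S' - S). Thus,

\int_{-\infty}^{\infty} d\mathbf{u}\, e^{2\pi i \mathbf{S}' \cdot \mathbf{u}}\, \widehat{fg}(\mathbf{u}) = F(\mathbf{S}')\, G(\mathbf{S}')    (13-54)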



The Fourier transform of a convolution product is just the product of the Fourier 
transforms of the two convoluted functions. 
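
This result can be checked numerically with a discrete, periodic analogue. The sketch below is an illustration only: it swaps the continuous transforms of the text for a small discrete Fourier transform on sampled sequences and uses a circular (wrap-around) convolution, both of which are assumptions made so that the check runs exactly on finite lists. The sign of the exponent follows the convention of Equation 13-49.

```python
import cmath


def dft(values):
    """Discrete transform F[s] = sum_r values[r] exp(+2 pi i s r / n)."""
    n = len(values)
    return [
        sum(v * cmath.exp(2j * cmath.pi * s * r / n)
            for r, v in enumerate(values))
        for s in range(n)
    ]


def circular_convolve(f, g):
    """Periodic convolution: out[u] = sum_x f[x] g[(u - x) mod n]."""
    n = len(f)
    return [sum(f[x] * g[(u - x) % n] for x in range(n)) for u in range(n)]


f = [1.0, 2.0, 0.0, -1.0, 0.5, 0.0]
g = [0.5, 0.0, 1.0, 0.0, 0.0, 2.0]
lhs = dft(circular_convolve(f, g))               # transform of the convolution
rhs = [p * q for p, q in zip(dft(f), dft(g))]    # product of the transforms
print(max(abs(p - q) for p, q in zip(lhs, rhs)))
```

The printed maximum difference should be at the level of floating-point rounding.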



Convolutions in the computation of x-ray scattering 

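Consider a one-dimensional molecular crystal. Within each unit cell, the electron
density distribution is ρ_m(r). The x-ray scattering from the contents of a single cell
located at the origin is given by Equation 13-7:

F_m(\mathbf{S}) = \int_{-\infty}^{\infty} d\mathbf{r}\, \rho_m(\mathbf{r})\, e^{2\pi i \mathbf{S} \cdot \mathbf{r}}    (13-55)

The lattice is generated by the vector a, and the lattice can be described by Equation
13-44 expressed in three dimensions:§

L(\mathbf{r}) = \sum_{n=-\infty}^{\infty} \delta(\mathbf{r} - n\mathbf{a})    (13-56)

The crystal is generated by the convolution

\widehat{\rho_m L}(\mathbf{u}) = \int_{-\infty}^{\infty} d\mathbf{r}\, \rho_m(\mathbf{r}) \sum_{n=-\infty}^{\infty} \delta[\mathbf{u} - (\mathbf{r} - n\mathbf{a})]    (13-57)

The x-ray scattering from the crystal, using Equation 13-7, is

F_{\mathrm{Tot}}(\mathbf{S}) = \int_{-\infty}^{\infty} d\mathbf{u}\, e^{2\pi i \mathbf{S} \cdot \mathbf{u}}\, \widehat{\rho_m L}(\mathbf{u})    (13-58)

This expression can be evaluated by using Equations 13-54 and 13-49, and changing
from the variable u to the equivalent variable r:

F_{\mathrm{Tot}}(\mathbf{S}) = \left( \int_{-\infty}^{\infty} d\mathbf{r}\, \rho_m(\mathbf{r})\, e^{2\pi i \mathbf{S} \cdot \mathbf{r}} \right) \left( \int_{-\infty}^{\infty} d\mathbf{r} \sum_{n=-\infty}^{\infty} \delta(\mathbf{r} - n\mathbf{a})\, e^{2\pi i \mathbf{S} \cdot \mathbf{r}} \right)    (13-59)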

§ Even with a one-dimensional molecular crystal, one must formally consider a three-dimensional
diffraction pattern, because the relative orientation of the molecule within the lattice will affect the
scattering.

The first term is just the structure factor of a single unit cell, F_m(S) (Eqn. 13-55). The
second term is the sampling function generated by the lattice, F_L(S). It is evaluated
simply by using the properties of the delta function. The result is



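F_{\mathrm{Tot}}(\mathbf{S}) = F_m(\mathbf{S})\, F_L(\mathbf{S}) = F_m(\mathbf{S}) \sum_{n=-\infty}^{\infty} e^{2\pi i n \mathbf{S} \cdot \mathbf{a}}    (13-60)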



This result is identical in form with Equation 13-31. However, it is more general 
because it holds for molecular crystals. This example shows the correctness and 
simplicity of the convolution approach. However, its real advantage is the physical 
insight that can be gained once one is used to it. 

Equations 13-59 and 13-60 mean that, to calculate x-ray scattering, one can sim- 
ply multiply the scattering expected from one unit cell by the sampling function gener- 
ated by the lattice. 

Another example of this approach will demonstrate its usefulness. Consider an 
infinite one-dimensional crystal with a cell length 2a and two identical atoms per cell, 
one at the vertex and one halfway between adjacent vertices:

Let us calculate the x-ray scattering expected for such a crystal. From Equation 13-59,
we evaluate the unit-cell structure factor as

F_m(\mathbf{S}) = \int_{-\infty}^{\infty} d\mathbf{r}\, \rho_m(\mathbf{r})\, e^{2\pi i \mathbf{S} \cdot \mathbf{r}} = f(\mathbf{S}) \left( 1 + e^{2\pi i \mathbf{S} \cdot \mathbf{a}} \right)    (13-61)



where we have used the definition of the atomic scattering factor. One must consider 
only two atoms— one at a vertex (say, the origin), and one at the center of the cell— 
because the atom spaced 2a away from the origin will be counted as part of the next 
unit cell, and the atom - a from the origin is counted as part of the preceding unit cell. 
The lattice shown leads to a sampling function 



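F_L(\mathbf{S}) = \int_{-\infty}^{\infty} d\mathbf{r} \sum_{n=-\infty}^{\infty} \delta(\mathbf{r} - 2n\mathbf{a})\, e^{2\pi i \mathbf{S} \cdot \mathbf{r}} = \sum_{n=-\infty}^{\infty} e^{4\pi i n \mathbf{S} \cdot \mathbf{a}}    (13-62)

Thus, the x-ray scattering is

F_{\mathrm{Tot}}(\mathbf{S}) = f(\mathbf{S}) \left( 1 + e^{2\pi i \mathbf{S} \cdot \mathbf{a}} \right) \sum_{n=-\infty}^{\infty} e^{4\pi i n \mathbf{S} \cdot \mathbf{a}}
    = f(\mathbf{S}) \sum_{n=-\infty}^{\infty} \left\{ e^{4\pi i n \mathbf{S} \cdot \mathbf{a}} + e^{4\pi i (n + 1/2) \mathbf{S} \cdot \mathbf{a}} \right\}    (13-63)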
By writing out the sum in the final expression term by term, we can easily show
it to be equal to

F_{\mathrm{Tot}}(\mathbf{S}) = f(\mathbf{S}) \sum_{n=-\infty}^{\infty} e^{2\pi i n \mathbf{S} \cdot \mathbf{a}}    (13-64)

This is identical to Equation 13-31, illustrating the important result that the scattering 
calculated for a crystal does not depend on how we choose to define the unit cell. 
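
The independence from the choice of unit cell can be verified numerically for a finite row of atoms. In the sketch below, which is an illustration with hypothetical function names, the common factor f(S) is dropped and the infinite sums are truncated so that both descriptions cover the same atoms: 2M atoms spaced by a, described either as 2M cells of length a with one atom each or as M cells of length 2a with two atoms each. The argument s stands for the product S • a.

```python
import cmath


def primitive_cell_sum(s, n_atoms):
    """Finite form of Eq. 13-64: cells of length a, one atom per cell."""
    return sum(cmath.exp(2j * cmath.pi * n * s) for n in range(n_atoms))


def doubled_cell_sum(s, n_cells):
    """Finite form of Eq. 13-63: cells of length 2a, atoms at 0 and a."""
    motif = 1 + cmath.exp(2j * cmath.pi * s)              # unit-cell factor
    lattice = sum(cmath.exp(4j * cmath.pi * n * s)        # sampling function
                  for n in range(n_cells))
    return motif * lattice


s = 0.37  # arbitrary value of S . a
print(primitive_cell_sum(s, 12))
print(doubled_cell_sum(s, 6))
```

The two printed values agree to rounding error for any s, since the second sum is just the first regrouped in pairs.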

Calculation of x-ray scattering from a molecular crystal 
using convolutions 

To treat a real crystal, one must extend Equations 13-55 and 13-56 to three-dimen-
sional arrays. The crystal is generated by the convolution \widehat{\rho_m L}, where L(r) is given by
Equation 13-46. Evaluating this (exactly in the way it was done in Equations 13-59
and 13-60) yields the structure factor of the crystal:

F_{\mathrm{Tot}}(\mathbf{S}) = F_m(\mathbf{S}) \sum_{n=-\infty}^{\infty} \sum_{m=-\infty}^{\infty} \sum_{p=-\infty}^{\infty} e^{2\pi i (n \mathbf{S} \cdot \mathbf{a} + m \mathbf{S} \cdot \mathbf{b} + p \mathbf{S} \cdot \mathbf{c})}    (13-65)

The triple sum in Equation 13-65 is the three-dimensional sampling function
generated by the lattice. It limits the detection of scattered intensity to geometries
allowed by the von Laue conditions. Applying these conditions, we can evaluate
S • a = h, S • b = k, and S • c = l to obtain

F_{\mathrm{Tot}}(h, k, l) = F_m(\mathbf{S}) \sum_{n=-\infty}^{\infty} \sum_{m=-\infty}^{\infty} \sum_{p=-\infty}^{\infty} e^{2\pi i (nh + mk + pl)}    (13-66)

Now every exponential term is simply unity, and the triple sum simplifies to

F_{\mathrm{Tot}}(h, k, l) = N F_m(\mathbf{S})    (13-67)

where N is the number of unit cells in the crystal.

It is convenient to write out the unit-cell structure factor, F_m(S), explicitly in
terms of the positions of each atom in the unit cell, and of the corresponding atomic
scattering factors. Using Equation 13-27 for the molecular structure factor, we choose
a coordinate system based on the unit-cell vectors a, b, and c. The position of the jth
atom in the unit cell is then

\mathbf{r}_j = x_j\mathbf{a} + y_j\mathbf{b} + z_j\mathbf{c}    (13-68)






where x_j, y_j, and z_j are now fractions of the corresponding unit-cell dimensions. Then
Equation 13-27 becomes

F_m(\mathbf{S}) = \sum_j f_j(\mathbf{S})\, e^{2\pi i (x_j \mathbf{S} \cdot \mathbf{a} + y_j \mathbf{S} \cdot \mathbf{b} + z_j \mathbf{S} \cdot \mathbf{c})}    (13-69)

where the sum is taken over all the atoms in one unit cell. However, F_m(S) can be
sampled only at geometries allowed by the von Laue conditions. When we apply
these, Equation 13-69 simplifies to

F_m(h, k, l) = \sum_j f_j(\mathbf{S})\, e^{2\pi i (h x_j + k y_j + l z_j)}    (13-70)

This equation is called the structure factor equation. It represents the unit-cell x-ray
scattering sampled at the reciprocal lattice points, h, k, and l.

Equation 13-70 is one of the key results in x-ray crystallography. It provides a 
direct way to calculate the x-ray diffraction of a crystal, provided that the structure 
of one unit cell is known. Alternatively, if the structure factor F_m(h, k, l) is known, the
electron density distribution of the crystal can be calculated. The equation used is 
identical to Equation 13-39. However, instead of using Equation 13-38 to describe 
the unit-cell contribution, one must use Equations 13-67 and 13-70. 
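
A minimal sketch of the structure factor equation is given below. It assumes, purely for illustration, that each atomic scattering factor can be replaced by a constant (in practice f depends on S), and the function name and atom list are hypothetical. The example cell contains two identical atoms at fractional coordinates (0, 0, 0) and (1/2, 1/2, 1/2); for that arrangement the two contributions add when h + k + l is even and cancel when it is odd.

```python
import cmath


def structure_factor(atoms, h, k, l):
    """Unit-cell structure factor at the reciprocal lattice point (h, k, l).

    atoms is a list of (f_j, x_j, y_j, z_j), where x_j, y_j, z_j are
    fractional coordinates and f_j is a constant stand-in for f_j(S).
    """
    return sum(
        f * cmath.exp(2j * cmath.pi * (h * x + k * y + l * z))
        for f, x, y, z in atoms
    )


# Two identical atoms (hypothetical constant scattering factor 6.0).
cell = [(6.0, 0.0, 0.0, 0.0), (6.0, 0.5, 0.5, 0.5)]
for hkl in [(1, 0, 0), (1, 1, 0), (1, 1, 1), (2, 0, 0)]:
    print(hkl, abs(structure_factor(cell, *hkl)))
```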



Bragg's law of diffraction

Most elementary treatments of x-ray diffraction discuss the process as the reflection 
of x rays from certain planes in the crystal lattice. Because this is probably the for- 
malism many readers have seen previously, it is worthwhile to show how our present 
treatment is equivalent. Lattice points are defined as the corners (vertices) of the unit 
cells. Lattice planes are a set of equidistant parallel planes constructed so that all 
lattice points lie on some member of the set. Clearly, the planes passing through the 
faces of the unit cells of the lattice are one such set of planes. Planes passing through 
opposite vertices of the unit cells also are a set of lattice planes (Fig. 13-14a). These
cut the axes (a, b, and c) precisely at the values corresponding to one unit translation 
of the lattice. However, increasingly finer-spaced lattice planes can be drawn that 
cut the a axis at any a/2, a/3, . . . , a/h of the unit translation (Fig. 13-14b,c). 

In three dimensions, planes exist that cut one axis every a/h while cutting
another at b/k and the third at c/l. The planes can be described by specifying the Miller
indices, h, k, and l. Note that the spacing (d) between adjacent planes is inversely
related to the size of the indices of the planes. Therefore, it is reasonable that the
planes could bear some relationship to the reciprocal lattice.

Figure 13-14

Three sets of lattice planes. Shown below each set are the Miller
indices (h, k) that describe it: (a) (1, 1); (b) (3, 1); (c) (1, 2).



In the Bragg's-law description of diffraction, an x-ray beam that impinges on
a lattice plane at an angle θ is described as being reflected from that plane at an equal
angle (Fig. 13-15a). This corresponds to a scattering angle 2θ. The Bragg conditions
for observing diffraction require that the path difference between reflected beams
from adjacent lattice planes be an integral number of wavelengths. From Figure
13-15a, we see that this condition clearly is met wherever

2d \sin\theta = n\lambda    (13-71)



where n is any integer, and d is the distance between two adjacent lattice planes. 
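
Equation 13-71 is easily turned into a small calculation. The sketch below is illustrative: the function name is hypothetical, and the example wavelength of 1.5418 Å (copper Kα radiation) and the 3.0 Å spacing are values chosen for the example rather than taken from the text. When nλ exceeds 2d no angle satisfies the condition, and the function returns None.

```python
import math


def bragg_angle_deg(d, wavelength, n=1):
    """Return the Bragg angle theta (degrees) from 2 d sin(theta) = n lambda.

    d and wavelength must be in the same units. Returns None when
    n * wavelength > 2 * d, in which case no reflection is possible.
    """
    sin_theta = n * wavelength / (2.0 * d)
    if sin_theta > 1.0:
        return None
    return math.degrees(math.asin(sin_theta))


wavelength = 1.5418  # angstroms; Cu K-alpha, used here only as an example
for order in (1, 2, 3, 4):
    theta = bragg_angle_deg(3.0, wavelength, order)
    print(order, theta, None if theta is None else 2 * theta)
```

The last printed column is the scattering angle 2θ for each order that can be observed.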

To compare the Bragg treatment with our previous description, it is necessary 
to show how the scattering vector S is related to a lattice plane. Consider a lattice 
plane that intersects the three axes of a unit cell at a/h, b/k, and c/l (Fig. 13-15b). Let 
r be a vector drawn from the origin to any point in this plane. Consider the properties 
of a scattering vector S that happens to satisfy the equation S • r = 1. For a fixed 
direction of S, the relation S • r = 1 defines a plane perpendicular to S, because it 
simply means that the projection of r on S is a constant. 

As we showed previously, not all values of S lead to detectable scattering. Only 
those values that satisfy the von Laue conditions are acceptable. These conditions are 
S • a = h, with equivalent equations for b and c. We can put the von Laue conditions 
in the form 



\mathbf{S} \cdot \mathbf{a}/h = \mathbf{S} \cdot \mathbf{b}/k = \mathbf{S} \cdot \mathbf{c}/l = 1    (13-72)

In other words, a/h, b/k, and c/l are all values of r that satisfy the condition S • r = 1.
These three values uniquely define a plane (Fig. 13-15b). This plane is a lattice plane
(as described in Fig. 13-14), but it is also a plane containing values of S that lead to 
x-ray diffraction. 



Figure 13-15

Derivation of Bragg's law. (a) Diffraction viewed as reflection of x rays from adjacent lattice planes.
(b) The lattice plane (h, k, l) is a plane containing vectors that satisfy the von Laue scattering conditions.



The lattice plane adjacent to the plane defined by S • r = 1 will pass through the
origin. The spacing d between these two planes is the length of a vector r₀ parallel to
S (Fig. 13-15b). The condition S • r₀ = 1 means that d = |r₀| = 1/|S|. However, we
showed earlier that |S| = 2|sin θ|/λ. Therefore, the scattering angle (2θ) produced by
crystal planes separated by a spacing d is given by

\sin\theta = \lambda / 2d    (13-73)

This is identical to Equation 13-71 with n = 1. Thus, the equivalence of the Bragg
treatment and the von Laue conditions has been illustrated. (To derive the full Bragg
equation, consider the properties of the plane defined by S • r = n where n is an
integer.)

13-3 PROPERTIES OF CRYSTALS

A crystal is a three-dimensional ordered array of molecules. From the discussion of
x-ray scattering in the previous section, it is clear that crystals are not a requirement
for x-ray diffraction measurements. Any ordered (or partially ordered) array of
molecules can, in principle, produce useful x-ray data. However, it is evident that
crystals are the most favorable samples. A large ordered array leads to sharp diffrac-
tion spots, which concentrate the scattered intensity in small discrete regions of scat-
tering angle (2θ). This greatly facilitates the acquisition of reliable intensity data.

If the sample is not a perfectly ordered crystal, intensity can reach observable
levels over wider ranges of scattering angles. The diffraction pattern can smear into
rings or streaks, and therefore considerable imprecision is introduced in assigning
values of θ observed. In general, only if the sample has three-dimensional order will
the diffraction pattern contain all the information needed to reconstruct the three- 
dimensional structure. Disorder corresponds to averaging over orientations of both 
the lattice and the molecules it contains. The resulting data then contains only 
information about the averaged structure. 

Restrictions on possible crystal lattices 

A crystal is essentially a three-dimensional mosaic. The unit cell defined by the vectors 
a, b, and c contains the fundamental repeating unit. The crystal is generated by suc- 
cessive translations of the unit cell along the axes a, b, and c; in just the same way, 
a mosaic is built up by placing down multiple copies of the same unit structure. It is 
a fundamental consequence of geometry that three-dimensional space can be filled 
only by mosaics of cells of certain shapes. There are, in fact, only seven fundamental 
types of unit cells. Each defines a crystal system (Fig. 13-16; Table 13-1). 

Each unit cell consists of a motif that is the actual unit repeated throughout the 
crystal by the lattice translations. The crystal is a convolution of the motif and the 
lattice (Fig. 13-17a). A motif can be a single atom or molecule, or it can contain more
than one molecule. 

The simplest possible crystals would have one motif positioned with the same 
orientation at each corner of the unit cells. There are eight corners and each is shared 
by eight unit cells. Therefore, there is one motif per unit cell. Such lattices are called 
primitive, and they are denoted by the letter P. It is always possible to choose a 
primitive triclinic cell for any lattice. This is the least-symmetric unit cell. Each dimen- 
sion and each angle are different. So it takes six parameters to specify such a cell. 

There are many cases in which the symmetry of the lattice can be increased if 
a larger unit cell containing additional lattice points located on the faces or at the 
center is chosen. These nonprimitive lattices have more than one copy of the motif 
per unit cell. By choosing a nonprimitive lattice, one often can describe the crystal 
with fewer parameters. There are a total of seven nonprimitive lattices (Fig. 13-16). 
They are designated I for cells having an extra lattice point at the center, C for cells
with two extra lattice points on one pair of opposite faces, and F for cells with extra
lattice points on all faces. You should be able to convince yourself that C and I
lattices have two motifs per unit cell, whereas F lattices have four.
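
The counting argument can be written out explicitly. The sketch below is an illustration with a hypothetical function name; it uses the sharing rules implied by the text: each of the eight corner points is shared by eight cells, a point on a face is shared by two cells, and a point at the center belongs to one cell only.

```python
from fractions import Fraction


def lattice_points_per_cell(centering):
    """Count lattice points (motifs) per unit cell for P, I, C, or F cells."""
    corners = 8 * Fraction(1, 8)           # eight corners, each shared by eight cells
    extra = {
        "P": Fraction(0),                  # primitive: corners only
        "I": Fraction(1),                  # one point at the cell center
        "C": 2 * Fraction(1, 2),           # one pair of opposite faces
        "F": 6 * Fraction(1, 2),           # all six faces
    }
    return corners + extra[centering]


for symbol in "PICF":
    print(symbol, lattice_points_per_cell(symbol))
```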

It is important to recognize that the choice of lattice is not always unique. Figure 
13-18 shows a few examples of alternative choices. There are certain conventions 
that help to decide which lattice to use, but we need not be concerned with them 
here and, in any event, they are not always rigidly adhered to. For each type of lattice, 
there are only certain arrangements of molecules or motifs that can be inserted without 
reducing the lattice to one of greater symmetry or one with a smaller unit cell. 

The choice of lattice can simplify the analysis of x-ray scattering data. However, 
it is important to reiterate the point made earlier that the x-ray scattering calculated 
for a crystal of known structure is independent of the choice of lattice. 

Figure 13-16

The fourteen Bravais lattices. For a list of their properties, see Table 13-1.
[After G. H. Stout and L. H. Jensen, X-Ray Structure Determination
(New York: Macmillan, 1968).]

Figure 13-17

Motifs, lattices, and symmetry operations. (a) A lattice and simple motif (a single hand). The crystal is
the convolution of the motif and the lattice. (b) Two motifs containing symmetry-related structures.
The two hands on the left are related by a twofold rotation (C₂) axis. The two hands on the right are
related by a 2₁ screw axis. In each structure, the motif consists of two hands. The asymmetric unit is
just a single hand. (c) Arrays generated by screw axes. From left to right, the axes are 2₁, 4₁, and 4₂;
the resulting unit cells have 2, 4, and 4 asymmetric units, respectively. [Drawings by Irving Geis.]

Figure 13-18

Choosing different unit cells for the same lattice. (a) Choice of C or I unit cells in a monoclinic lattice.
(b) Choice of C or P unit cells in a tetragonal lattice. (c) Choice of an orthorhombic C lattice or a
hexagonal P lattice. [After G. H. Stout and L. H. Jensen, X-Ray Structure Determination (New York:
Macmillan, 1968).]



Symmetry properties of molecules and crystals 

The overall symmetry of the crystal is called the space group. It is described by 
naming the type of unit cell and any symmetry relationships within the molecules 
that make up the motif. For arbitrary structures, it turns out that there are precisely 
230 possible space groups. These contain two types of symmetry: point symmetry
and space symmetry.

Point-symmetry operations consist of manipulations of an isolated object that
leave at least one point in space unchanged (Box 2-3). These can include

1. rotation axes, named by a number (2 for twofold axis, 3 for threefold axis, and 
so on); 

2. mirror planes, designated by m; 

3. rotation coupled with reflection (for example, combining a twofold rotation 
axis with a mirror plane perpendicular to it, resulting in inversion of an object 
through an origin located at the intersection of the rotation axis and the 
mirror plane; this operation is designated 2/m); 

4. rotation-inversion axes, designated by a number with an overbar (for example,
4̄ indicates that each rotation of 90° is accompanied by inversion through
the origin).

A point group is a list of all of the point-symmetry relationships possessed by an 
object. The object can be a molecule, a set of molecules, or an entire crystal. Several 
of the point groups possible for molecules consisting of multiple copies of identical 
subunits are illustrated in Chapter 2. 

Space-symmetry operations involve translation of the object. These include 
screw axes (which are a rotation accompanied by translation) and glide planes (which 
are translations accompanied by reflection). Screw axes are called n_m, where n is the
rotation axis, and m/n is the fraction of a unit cell along which the translation occurs.
For example, 3₁ indicates a rotation of 120° accompanied by a translation 1/3 of the
unit-cell length. The description of glide planes is complicated, because it depends
on which face or diagonal the glide is along, as well as on how far the glide occurs.
Figure 13-17c shows a few examples of motif-symmetry operations.
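
The action of a screw axis n_m can be sketched in a few lines. The example below is an illustration under stated assumptions: it works in Cartesian coordinates, places the screw axis along z through the origin, and takes c as the unit-cell length along that axis; the function name is hypothetical. Applying the operation n times returns every point to its original x and y with a net translation of m unit cells along z, which is why the operation is compatible with the lattice.

```python
import math


def screw_axis(point, n, m, c=1.0):
    """Apply the screw operation n_m about the z axis to a Cartesian point.

    Rotates by 360/n degrees about z and translates by (m/n) * c along z.
    """
    x, y, z = point
    angle = 2.0 * math.pi / n
    return (
        x * math.cos(angle) - y * math.sin(angle),
        x * math.sin(angle) + y * math.cos(angle),
        z + m * c / n,
    )


p = (0.2, 0.1, 0.0)
for _ in range(3):
    p = screw_axis(p, 3, 1)   # the 3_1 axis of the example in the text
    print(p)
```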

The presence of particular symmetry elements in the motif restricts the possible 
type of unit cell. For example, if a twofold rotation axis is present in the space group, 
this axis must be perpendicular to two unit-cell vectors. Otherwise, this symmetry 
operation would leave the motif unaltered internally, but would change its location 
within the unit cell. The presence of threefold or higher rotation axes requires that 
the two unit-cell vectors perpendicular to the axis must be equal in length. 

Space groups available to biological molecules 

The allowed combinations of the point and space symmetry possessed by the motif 
generate the 230 space groups. It is convenient to introduce the concept of an asymmetric
unit. This is the smallest unit from which the crystal structure can be generated
by making use of the symmetry operations of the space group. The asymmetric unit 
can be several molecules, one molecule, or a subunit of an oligomeric molecule. The 
crystal is generated, first by creating the motif by the space-group symmetry opera- 
tions on the asymmetric unit, and then by translation of the motif through the lattice. 






X-RAY CRYSTALLOGRAPHY 



The number of asymmetric units per unit cell, $n'$, is determined by the space group.

For biological molecules, the motifs inevitably contain asymmetric carbon 
atoms. Therefore, the symmetry arrangement of the molecules can never contain 
mirror planes, glide planes, centers of symmetry, or rotation-inversion axes. Only 65
of the 230 space groups can apply to biological molecules. The biologically relevant 
space groups can contain 1, 2, 3, 4, 6, 8, 12, 24, 48, or 96 asymmetric units per unit 
cell (Table 13-1). 

A practicing crystallographer presumably will learn to picture many of these 
space groups. However, molecules seem to prefer to crystallize in only a limited 
number of space groups. For example, 80% of 1,200 organic compounds surveyed 
fell into the triclinic, monoclinic, or orthorhombic crystal classes, and half of these 
occurred in just three space groups. Figure 13-19 shows a few of the most commonly 
found of the space groups allowable for biological molecules. Note that these dif- 
ferent groups imply different numbers of molecules per unit cell. 




Figure 13-19 

A familiar asymmetric unit as it might appear in four different space groups. $P1$ has no motif symmetry;
$P2_1$ has a single $2_1$ screw axis shown as a half-arrow; $P2_12_12$ has four screw axes (each $2_1$) and a $C_2$
axis perpendicular to the plane of the page; $C2$ has two $2_1$ screw axes and a $C_2$ rotational axis shown as
the full arrow. [Drawings by Irving Geis.]

Determination of the dimensions of the crystal lattice

X-ray diffraction occurs whenever the scattering vector coincides with a reciprocal 
lattice point, as we have shown. That means that the resulting diffraction pattern 
can be used to construct an image of the reciprocal lattice. From the spacing between 
diffraction spots as actually observed in the laboratory and a knowledge of the 
geometry of the diffraction experiment, one can compute the spacing between points 
on the reciprocal lattice. This in turn allows the geometry of the unit cell to be 
calculated. 

Here we shall demonstrate the determination of the spacing of a one-dimensional
crystal. The crystal is placed in the center of a cylindrical film (Fig. 13-20). The von
Laue conditions for the $\mathbf{a}$ crystal axis require that $\mathbf{S} \cdot \mathbf{a} = h$. Consider the first two
diffraction planes, which will occur at $h = 0$ and $h = 1$. If we make measurements
with the crystal oriented so that $\mathbf{S}$ is parallel to $\mathbf{a}$, then the length of the scattering
vector, $|\mathbf{S}|$, is 0 for $h = 0$ and is $1/a$ for $h = 1$. The scattering angle is computed from
Equation 13-5: $|\mathbf{S}| = 2|\sin\theta|/\lambda$. For $h = 0$, we have $\sin\theta = 0$ and $\theta = 0$. For $h = 1$,
we have $\sin\theta = \lambda|\mathbf{S}|/2 = \lambda/2a$. Therefore, the angle between the two scattering planes
is $2\theta = 2\sin^{-1}(\lambda/2a)$. Thus, if $\theta$ is measured experimentally, and if $\lambda$ is known, the
distance $a$ can be computed.







Figure 13-20 

Experimental scattering geometry. Shown are a sample at the origin, a section of the reciprocal lattice,
one scattering vector $\mathbf{S}$ (and the scattered radiation associated with it), and two layer lines as they
intersect the film. Below the reciprocal lattice, three atoms in the actual crystal lattice are illustrated to
show the orientation of the sample. In this example, for clarity, we show values of $\lambda/2a$ and $\theta$ much
larger than the values typically encountered in actual experiments.
In a typical case, $\lambda$ might be 1 Å, and $a$ might be 10 Å. Therefore, $\sin\theta$ is 1/20,
or $\theta$ is about 3°. A common x-ray camera would have film arranged in a cylinder
28.65 mm in radius. Its circumference is $2\pi \times 28.65$ mm. The angle between the $h = 0$
and $h = 1$ scattering planes is $2\theta$. This is 6°, or 6/360 the circumference of the film.
Therefore, the distance between the two scattering planes as they intersect the film
is $2\pi \times 28.65 \times 6/360 \approx 3$ mm.

Note that, although the actual crystal spacings are very small, the film is placed 
far away from the sample. This magnifies the diffraction pattern until planes are 
physically separated by a distance convenient for measuring. In an unknown case, 
all one has to do is work the calculation backwards to obtain $a$. The actual equation
is $a = \lambda/[2\sin(360D/4\pi r)]$, with the angle in degrees, where $D$ is the physically measured spacing on the x-ray
film, and $r$ is the radius of the camera.
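The forward and backward calculations just described can be sketched in a few lines. The function names `film_spacing` and `lattice_spacing` are illustrative assumptions; the code works in radians, so the angle $\theta = D/2r$ replaces the equivalent expression $360D/4\pi r$ in degrees.

```python
import math

def film_spacing(a, wavelength, radius):
    """Distance D on the film between the h = 0 and h = 1 layer lines.

    a and wavelength share one length unit (for example angstroms);
    radius and the returned D share another (for example millimetres).
    """
    theta = math.asin(wavelength / (2.0 * a))   # sin(theta) = lambda / 2a
    return radius * 2.0 * theta                 # arc subtended by the angle 2*theta

def lattice_spacing(D, wavelength, radius):
    """Work the calculation backwards: a = lambda / (2 sin(theta))."""
    theta = D / (2.0 * radius)                  # radians; 360 D / (4 pi r) in degrees
    return wavelength / (2.0 * math.sin(theta))

# The typical case from the text: lambda = 1 angstrom, a = 10 angstroms, r = 28.65 mm
D = film_spacing(10.0, 1.0, 28.65)
print(round(D, 2), round(lattice_spacing(D, 1.0, 28.65), 2))
```

The computed spacing is slightly under 3 mm because the text rounds $\theta$ up to 3°.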

The relationship between the crystal lattice and the reciprocal lattice 

Real crystals are three-dimensional. The reciprocal lattice that one sees in an x-ray 
diffraction pattern also is three-dimensional. It is related in a simple way to the actual 
crystal lattice. By measuring the spatial pattern of diffracted spots, it is possible to 
compute the cell dimensions and shape of the reciprocal lattice. From this, the corres- 
ponding dimensions and shape of the unit cell of the actual crystal lattice can be 
derived. 

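Figure 13-21a shows that each of the vectors $\mathbf{a}^*$, $\mathbf{b}^*$, and $\mathbf{c}^*$ defining the reciprocal
cell is located along lines formed by the intersection of two planes. For example, $\mathbf{c}^*$
is formed by the intersection of planes generated by successive values of $h$ for the
von Laue condition $\mathbf{a} \cdot \mathbf{S} = h$ (and therefore these planes are perpendicular to $\mathbf{a}$) and
planes generated by $\mathbf{b} \cdot \mathbf{S} = k$ (and thus perpendicular to $\mathbf{b}$). This means that $\mathbf{c}^*$ must
be perpendicular to both $\mathbf{a}$ and $\mathbf{b}$, and we can write, in general,

$$\mathbf{c}^* = r\,\mathbf{a} \times \mathbf{b} \tag{13-74a}$$
$$\mathbf{b}^* = q\,\mathbf{c} \times \mathbf{a} \tag{13-74b}$$
$$\mathbf{a}^* = p\,\mathbf{b} \times \mathbf{c} \tag{13-74c}$$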

The constants $r$, $q$, and $p$ control the magnitudes of the reciprocal-cell vectors. To determine
the constants, we must use the von Laue conditions that generate the reciprocal
lattice. For example, the condition $\mathbf{S} \cdot \mathbf{c} = l$ generates a set of planes spaced by $1/c$.
The vector $\mathbf{c}^*$ extends between two such planes, although it is not necessarily normal
to them (Fig. 13-21b). However, the projection of $\mathbf{c}^*$ on $\mathbf{c}$ must be $1/c$ (Fig. 13-21b).
Thus we can write



$$\mathbf{c} \cdot \mathbf{c}^* = |\mathbf{c}|(1/c) = 1 \tag{13-75}$$







Figure 13-21 

Geometrical properties of the reciprocal lattice. (a) Reciprocal-lattice vectors lie at the intersections of
sets of parallel planes. For example, $\mathbf{c}^*$ extends between two planes spaced $1/|\mathbf{c}|$ apart, but it is formed
by the intersections of planes perpendicular to $\mathbf{a}$ (spaced $1/|\mathbf{a}|$ apart) and planes perpendicular to $\mathbf{b}$
(spaced $1/|\mathbf{b}|$ apart). (b) Demonstration that $\mathbf{c}^*$ and $\mathbf{c}$ are not necessarily parallel.

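Similarly, one can show that $\mathbf{a} \cdot \mathbf{a}^* = 1$ and $\mathbf{b} \cdot \mathbf{b}^* = 1$.

When Equation 13-74a is inserted into Equation 13-75, the result is
$1 = r\,\mathbf{c} \cdot \mathbf{a} \times \mathbf{b}$, from which we can evaluate $r$. Carrying out equivalent manipulations for
$\mathbf{a}^*$ and $\mathbf{b}^*$, we obtain

$$r = 1/(\mathbf{c} \cdot \mathbf{a} \times \mathbf{b}) \qquad q = 1/(\mathbf{b} \cdot \mathbf{c} \times \mathbf{a}) \qquad p = 1/(\mathbf{a} \cdot \mathbf{b} \times \mathbf{c}) \tag{13-76}$$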
Using the properties of the triple scalar product (see Box 8-2), each of the triple products
in the denominators is equal to the volume of the parallelepiped formed by the three vectors $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{c}$.
Thus, $r = q = p = 1/V$, where $V$ is the volume of the unit cell of the crystal. Using
Equations 13-74, 13-75, and 13-76, we can construct the reciprocal cell if the actual 
unit cell of the crystal is known. Figure 13-22 shows two examples. 
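As a minimal sketch of this construction, the reciprocal cell can be computed directly from the cross products with $r = q = p = 1/V$. The helper names and the monoclinic cell below are hypothetical, chosen only so that $\mathbf{c}$ and $\mathbf{c}^*$ are visibly not parallel.

```python
def cross(u, v):
    """Vector (cross) product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Scalar (dot) product of two 3-vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

def reciprocal_cell(a, b, c):
    """Return (a*, b*, c*) as scaled cross products of the cell vectors."""
    V = dot(a, cross(b, c))                      # unit-cell volume
    a_star = tuple(x / V for x in cross(b, c))
    b_star = tuple(x / V for x in cross(c, a))
    c_star = tuple(x / V for x in cross(a, b))
    return a_star, b_star, c_star

# Hypothetical monoclinic cell (angstroms): b is perpendicular to a and c
a = (10.0, 0.0, 0.0)
b = (0.0, 20.0, 0.0)
c = (-5.0, 0.0, 30.0)
a_s, b_s, c_s = reciprocal_cell(a, b, c)
# c . c* should be 1, while a . c* and b . c* should vanish
print(dot(c, c_s), dot(a, c_s), dot(b, c_s))
```

Here $\mathbf{c}^*$ points along the laboratory z axis although $\mathbf{c}$ is tilted away from it, which is the situation drawn in Figure 13-21b.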

In practice, observations of the geometric pattern of diffraction spots allow 
measurement of the reciprocal lattice vectors a*, b*, and c*. Then one must compute 
the unit-cell vectors. The procedures are quite similar to that just outlined. Note, 
for example, that b* and c* lie in a plane perpendicular to a (Fig. 13-21). Therefore, 
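$\mathbf{a} = r'\,\mathbf{b}^* \times \mathbf{c}^*$, and similar equations exist for $\mathbf{b}$ and $\mathbf{c}$. To determine $r'$, one uses
the constraint $\mathbf{a} \cdot \mathbf{a}^* = 1$. Then $r' = 1/(\mathbf{a}^* \cdot \mathbf{b}^* \times \mathbf{c}^*) = 1/V^*$, where $V^*$ is the volume of
the reciprocal cell. Thus, the unit cell is constructed from the measured diffraction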
data by 



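$$\mathbf{a} = (1/V^*)(\mathbf{b}^* \times \mathbf{c}^*) \tag{13-77a}$$
$$\mathbf{b} = (1/V^*)(\mathbf{c}^* \times \mathbf{a}^*) \tag{13-77b}$$
$$\mathbf{c} = (1/V^*)(\mathbf{a}^* \times \mathbf{b}^*) \tag{13-77c}$$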




Figure 13-22 

Comparisons of direct and reciprocal unit cells. (a) For an orthorhombic crystal. (b) For a triclinic
crystal. [After G. H. Stout and L. M. Jensen, X-Ray Structure Determination (New York: Macmillan,
1968).]
A necessary consequence of Equations 13-74 and 13-77 is that the volumes of unit
cells and reciprocal cells are inverse (see Guinier, 1963, p. 88):

$$V = 1/V^* \tag{13-78}$$
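The inverse relation between the two volumes can be checked numerically. In the sketch below, the function `dual_cell` and the triclinic reciprocal-cell vectors are hypothetical; the same function serves for both directions because the construction of the direct cell from the reciprocal cell has the same form as the construction of the reciprocal cell from the direct cell.

```python
def cross(u, v):
    """Vector (cross) product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Scalar (dot) product of two 3-vectors."""
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]

def dual_cell(u, v, w):
    """Return the cell dual to (u, v, w), and the volume of (u, v, w)."""
    vol = dot(u, cross(v, w))
    first = tuple(x / vol for x in cross(v, w))
    second = tuple(x / vol for x in cross(w, u))
    third = tuple(x / vol for x in cross(u, v))
    return (first, second, third), vol

# Hypothetical measured reciprocal-cell vectors, triclinic (1/angstrom)
a_star = (0.100, 0.000, 0.000)
b_star = (0.010, 0.050, 0.000)
c_star = (0.005, 0.002, 0.040)

(a, b, c), V_star = dual_cell(a_star, b_star, c_star)   # direct cell from a*, b*, c*
_, V = dual_cell(a, b, c)                               # V is the direct-cell volume
print(V_star * V)                                       # should be 1 within rounding
```

Applying the construction twice should also return the original reciprocal-cell vectors, which is a useful consistency check on measured data.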

Determination of the space group 

In addition to determining the unit cell, it is useful to establish the space group of 
the crystal. This shows if there is any internal symmetry, and whether one can use 
this symmetry to reduce the portion of the structure that must be solved. There is 
no general way to find the space group but, for many biological molecules, one can 
sharply narrow the possibilities by simple examination of the intensity of diffraction 
spots in the reciprocal lattice. Particular crystal classes show symmetries in the 
diffraction pattern that correspond to symmetries in the space group. For example, 
a twofold rotation axis in the crystal leads to a mirror plane in the diffraction pattern 
intensities. 

Even more informative are systematic absences of intensity at certain points of 
the reciprocal lattice for many space groups. Consider a space group with a twofold 
screw axis along $\mathbf{c}$. This axis rotates $x$ to $-x$, rotates $y$ to $-y$, and translates half of
the unit-cell distance along $\mathbf{c}$. Then, for each atom at $\mathbf{r}_j = x_j\mathbf{a} + y_j\mathbf{b} + z_j\mathbf{c}$, there must
be an identical atom at $\mathbf{r}_j' = -x_j\mathbf{a} - y_j\mathbf{b} + (z_j + 1/2)\mathbf{c}$. In calculating the structure
factor, one can group the identical atoms by pairs. From Equation 13-70,

$$F_m(h, k, l) = \sum_{j=1}^{N/2} f_j(S)\left(e^{2\pi i(hx_j + ky_j + lz_j)} + e^{2\pi i(-hx_j - ky_j + lz_j + l/2)}\right) \tag{13-79}$$

When $h = k = 0$, the structure factor becomes

$$F_m(0, 0, l) = \sum_{j=1}^{N/2} f_j(S)\,e^{2\pi i l z_j}\left(1 + e^{\pi i l}\right) \tag{13-80}$$

Whenever $l$ is odd, the exponential in the last term becomes equal to $-1$, and so the
scattering amplitude vanishes. Thus, there will be no diffraction in the special case
$h = 0$, $k = 0$, $l$ = odd. If such absences are not sufficient to uniquely determine the
space group, sometimes a statistical analysis of the pattern of intensities can complete
the assignment.
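The systematic absences produced by a twofold screw axis can be verified with a small calculation. In this sketch, the atom list is hypothetical and the atomic scattering factors are treated as constants rather than as functions of $S$, which is a simplifying assumption.

```python
import cmath

def structure_factor(atoms, h, k, l):
    """F(h, k, l) = sum_j f_j exp[2 pi i (h x_j + k y_j + l z_j)].

    atoms is a list of (f, x, y, z) with fractional coordinates.
    """
    return sum(f * cmath.exp(2j * cmath.pi * (h * x + k * y + l * z))
               for f, x, y, z in atoms)

def apply_screw_2_1(atoms):
    """Add the partner of each atom generated by a 2_1 screw axis along c."""
    partners = [(f, -x, -y, z + 0.5) for f, x, y, z in atoms]
    return atoms + partners

# Hypothetical asymmetric unit: three atoms at arbitrary positions
asym = [(6.0, 0.10, 0.20, 0.05), (7.0, 0.33, 0.41, 0.27), (8.0, 0.62, 0.15, 0.70)]
cell = apply_screw_2_1(asym)
for l in range(1, 5):
    # |F(0, 0, l)| should vanish for odd l only
    print(l, round(abs(structure_factor(cell, 0, 0, l)), 6))
```

Only the $(0, 0, l)$ row is affected; a general reflection such as $(1, 0, 1)$ is not required to vanish.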

Crystallographic estimation of molecular weight 

Once the lattice and space group are known, it frequently is possible to determine 
the molecular weight of the molecules that compose the crystals. The density of the 
crystal, $\rho_c$, can be measured experimentally. Then the weight $W$ of one unit cell can
be computed as

$$W = \rho_c V \tag{13-81}$$

where V is the volume of the unit cell determined from the diffraction pattern. 

In general, protein and other macromolecule crystals can be viewed as con- 
taining three components: anhydrous macromolecule (m), free solvent (s), and bound 
water (w). The weight of one unit cell will be the sum of the three components:

$$\rho_c V = \rho_m V_m + \rho_w V_w + \rho_s V_s \tag{13-82}$$

where $V$ refers to the volume of each component, and $\rho$ is its density. Usually, $\rho_s$ is
known experimentally, and $\rho_w$ can be taken as the density of pure water.

We want to compute the weight of macromolecule per unit cell, $\rho_m V_m$. Thus,
we must eliminate the unknown quantities $V_w$ and $V_s$ from Equation 13-82. The
volume of bound water, $V_w$, can be written in terms of the hydration $\delta_1$, in grams
water per gram macromolecule.

$$V_w = \delta_1 \rho_m V_m/\rho_w \tag{13-83}$$

Hydration values are known approximately for proteins and nucleic acids (Chapter 
10). The total volume of the unit cell is

$$V = V_m + V_w + V_s \tag{13-84}$$

Using Equation 13-83, the volume of free solvent can be written as

$$V_s = V - V_m(1 + \delta_1 \rho_m/\rho_w) \tag{13-85}$$

Inserting Equations 13-83 and 13-85 into Equation 13-82, and solving the
resulting expression for $\rho_m V_m$, we obtain

$$W_m = \rho_m V_m = \frac{(\rho_c - \rho_s)V}{1 - \rho_s/\rho_m + \delta_1(1 - \rho_s/\rho_w)} \tag{13-86}$$

Thus the weight of macromolecule per unit cell ($W_m$) can be calculated if the anhydrous
density ($\rho_m$) is known. Usually it is a good approximation to equate $\rho_m^{-1}$ with
the partial specific volume ($\bar{V}_m$) measured for the macromolecule in solution.

If there is a single macromolecule per unit cell, then the molecular weight is
just $M = N_0 W_m$, where $N_0$ is Avogadro's number. If the space group indicates $n'$
asymmetric units per unit cell, the molecular weight of an asymmetric unit is

$$M = N_0 W_m/n' \tag{13-87}$$

Alternative methods for determining the molecular weight of molecules in a crystal 
are discussed by B. W. Matthews (1975). 

Often an estimate of the molecular weight is already available from hydrodynamic
measurements or primary structure data. Then $n'$ can be computed from
a measurement of $W_m$. This value must always be an integer. Therefore, once an
estimate of $n'$ is available, it can be used to refine the value of the molecular weight.
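The arithmetic of this procedure can be sketched as follows. The expression for the macromolecule weight per unit cell is the one obtained by eliminating $V_w$ and $V_s$ from the mass balance above; all numerical inputs are hypothetical and serve only to illustrate the calculation, and the function names are assumptions.

```python
AVOGADRO = 6.022e23          # molecules per mole
A3_TO_CM3 = 1.0e-24          # one cubic angstrom in cubic centimetres

def macromolecule_weight(rho_c, rho_s, rho_m, V, delta1, rho_w=1.0):
    """Grams of anhydrous macromolecule per unit cell.

    Densities are in g/cm^3, V is in cubic angstroms, and delta1 is the
    hydration in grams of water per gram of macromolecule.
    """
    numerator = (rho_c - rho_s) * V * A3_TO_CM3
    denominator = 1.0 - rho_s / rho_m + delta1 * (1.0 - rho_s / rho_w)
    return numerator / denominator

def asymmetric_unit_weight(W_m, n_prime):
    """Molecular weight of one asymmetric unit, M = N0 * W_m / n'."""
    return AVOGADRO * W_m / n_prime

# Hypothetical inputs, chosen only to illustrate the arithmetic
W_m = macromolecule_weight(rho_c=1.25, rho_s=1.10, rho_m=1.37, V=200000.0, delta1=0.3)
estimate = 14000.0                               # rough value, e.g. from hydrodynamics
n_prime = round(AVOGADRO * W_m / estimate)       # n' must be an integer
print(n_prime, round(asymmetric_unit_weight(W_m, n_prime)))
```

Rounding to the nearest integer and then recomputing $M$ is the refinement step described in the text: the crystallographic value replaces the rougher independent estimate.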

Using the space group for information on macromolecule symmetry 

In most cases, the number of molecules per unit cell ($n$) is equal to the number of
asymmetric units ($n'$). Here we consider the special case where the macromolecule
is an oligomer of identical subunits. For example, a molecule with five subunits
might have $C_5$ symmetry. But this symmetry can never correspond to a symmetry
element of the space group, because there is no space group with a $C_5$ rotation axis.
Therefore, the asymmetric unit must contain all five subunits. 

On the other hand, in many cases, a molecule with $C_2$ or $C_3$ rotational symmetry
crystallizes so that its axis is also a symmetry axis of the motif. Then it is possible 
that the number of asymmetric units per cell will be an integral multiple of the 
number of subunits per unit cell, rather than a multiple of the number of molecules. 
This relationship permits one to infer the presence of rotational axes of symmetry 
in the macromolecule. Note, however, that it is not necessary for all rotation axes of 
a molecule simultaneously to be rotation axes of the crystal. Therefore, the estimate 
of symmetry is a minimal estimate. 

An example is aspartate transcarbamoylase, which was treated in Chapter 2. In
one crystal form, this enzyme crystallizes in a space group with eight asymmetric 
units per cell, but there are only four molecules per cell. This indicates the presence 
of a twofold rotation axis. In a second crystal form, the space group has six asym- 
metric units per cell, but the cell dimensions allow for only two molecules per cell. 
Thus, a threefold rotation axis exists in the molecule. Because each subunit must be 
an asymmetric object, the only way that both of these axes can exist simultaneously 
is for the molecule to have six (or some integral multiple of six) subunits of each 
type. For aspartate transcarbamoylase, the subunit structure is $c_6r_6$, and a model of
the symmetric arrangement is shown in Figure 2-49. 
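The counting argument for aspartate transcarbamoylase can be written out explicitly. The function name `rotation_order` is an illustrative assumption; the numbers are those quoted above for the two crystal forms.

```python
import math

def rotation_order(asymmetric_units, molecules):
    """Order of the molecular rotation axis that coincides with a crystal axis."""
    if asymmetric_units % molecules:
        raise ValueError("asymmetric units must be a multiple of molecules")
    return asymmetric_units // molecules

# The two crystal forms of aspartate transcarbamoylase described in the text
form_1 = rotation_order(8, 4)      # twofold axis
form_2 = rotation_order(6, 2)      # threefold axis
# Each type of subunit must occur in a multiple of both orders
minimum_copies = form_1 * form_2 // math.gcd(form_1, form_2)
print(form_1, form_2, minimum_copies)   # prints: 2 3 6
```

The least common multiple gives only the minimum number of copies of each subunit type, in keeping with the remark that the symmetry estimate is a minimal one.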



Varying scattering geometry to measure diffraction pattern 

Reciprocal lattice points are precisely those locations in space that satisfy the 
von Laue conditions for scattering. Whenever the crystal and incident beam ($\mathbf{s}_0$) are
oriented so that the scattering vector $\mathbf{S}$ contacts a reciprocal lattice point, diffracted
intensity is observed along the vector $\mathbf{s} = \lambda\mathbf{S} + \mathbf{s}_0$. In making measurements, one
has the choice of varying the orientations of the crystal, the detector, or the incident 
beam. Usually it is the crystal that is allowed to move. The reciprocal lattice is fixed 
in space for a fixed crystal orientation. If the crystal is rotated through an angle 
about an axis, the reciprocal lattice will rotate the same angle about the same 
laboratory axis. 

For given x-ray wavelength, crystal orientation, and incident x-ray beam, all 
possible scattering vectors extend from the origin to the surface of the sphere of 
reflection (Fig. 13-23a). The sphere can be generated from Figure 13-3a by allowing 
all possible orientations of $\mathbf{S}$. It has a diameter of $2/\lambda$ because this is the largest
possible value of $|\mathbf{S}|$, occurring at $\theta = 90°$. The surface of the sphere passes through
the origin of the reciprocal lattice. Here, $h = k = l = 0$, the scattering vector $\mathbf{S}$ has
length zero, and $\theta = 0°$; all radiation is forward scattered.

The sphere of reflection will enclose a set of reciprocal lattice points. However, 
diffraction will be observed only when these points intersect the surface of the sphere. 
Clearly, if the reciprocal lattice is literally composed of points, the probability of 
this occurring is infinitesimal. Fortunately, the actual radiation used in diffraction 
experiments is a distribution of wavelengths. This means that a spherical zone of S 
applies, rather than just a surface. Furthermore, in an actual crystal, zones of unit 
cells differ very slightly in orientation. This effect, called mosaicism, means that the 
reciprocal lattice points will be of a finite size. Nevertheless, not many lattice points 
will intersect the surface of the scattering sphere simultaneously (Fig. 13-23a). 



Figure 13-23

Experimental restrictions on the observation of x-ray diffraction. (a) For a fixed geometry and x-ray
wavelength, scattering will be observed only when the surface of the sphere of reflection intersects
reciprocal-lattice points. (b) Even if all possible geometries are sampled, only that portion of the
reciprocal lattice that lies within a sphere of radius $2/\lambda$ (the limiting sphere) can be examined.
To collect sufficient diffraction data to solve a crystal structure, one must measure
as many diffracted rays as possible. Therefore, what is usually done is to rotate
or oscillate the crystal in a systematic way. This causes successive lattice points to 
intersect the scattering sphere and permits the resulting diffraction to be measured 
(Fig. 13-24). Note that it is the discontinuous nature of the reciprocal lattice that 
makes it difficult to collect diffraction data for a three-dimensional crystal. In a 
two-dimensional sample, the reciprocal lattice is a set of lines (Fig. 13-12). Most of 
these will intersect the sphere of reflection somewhere, and so a single sample geom- 
etry can yield many diffraction spots (Fig. 13-11). 
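The effect of rotating the crystal can be imitated numerically. The sketch below assumes a hypothetical cubic cell, an incident beam along +x, rotation about the laboratory z axis, and an arbitrary tolerance `tol` standing in for the finite size of real reciprocal-lattice points; a point diffracts when $|\lambda\mathbf{S} + \mathbf{s}_0| = 1$, that is, when it lies on the sphere of reflection.

```python
import math

def diffracts(S, s0, wavelength, tol):
    """True if S lies on the sphere of reflection: |lambda*S + s0| = 1 within tol."""
    s = [wavelength * Si + s0i for Si, s0i in zip(S, s0)]
    return abs(math.sqrt(sum(x * x for x in s)) - 1.0) < tol

def rotate_z(S, phi):
    """Rotate a reciprocal-lattice vector by phi radians about the z axis."""
    x, y, z = S
    return (x * math.cos(phi) - y * math.sin(phi),
            x * math.sin(phi) + y * math.cos(phi), z)

def reflections(a, wavelength, phi, hmax=4, tol=2e-3):
    """Indices (h, k, l) of a cubic lattice that diffract at crystal angle phi."""
    s0 = (1.0, 0.0, 0.0)                       # incident beam along +x
    found = set()
    rng = range(-hmax, hmax + 1)
    for h in rng:
        for k in rng:
            for l in rng:
                if (h, k, l) == (0, 0, 0):
                    continue                   # skip the forward-scattered beam
                S = rotate_z((h / a, k / a, l / a), phi)
                if diffracts(S, s0, wavelength, tol):
                    found.add((h, k, l))
    return found

# Hypothetical cubic cell, a = 10 angstroms, wavelength 1.5 angstroms
still = reflections(10.0, 1.5, 0.0)
swept = set()
for step in range(360):                        # rotate in 1-degree steps
    swept |= reflections(10.0, 1.5, math.radians(step))
print(len(still), len(swept))
```

With the crystal held still, few if any points satisfy the condition; stepping through a full rotation collects many more. Coarse angular steps can skip reflections that cross the sphere quickly, which is one reason real instruments rotate or oscillate continuously.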








Figure 13-24 

The effect on observed diffraction of rotating the sample. (a) One sample orientation where a pair of
scattering vectors (black arrows) intersect the reciprocal lattice. Note that, if the scattered radiation
(colored arrows) is viewed as originating from the center of the sphere of reflection, $\mathbf{s}$ intersects the same
reciprocal-lattice point as does $\mathbf{S}$. (b) An alternative geometry, in which only a single reciprocal-lattice
point is sampled.






Several methods for collecting scattering data 

The spatial pattern of diffracted rays that emerges as one rotates a crystal is not 
necessarily a simple one. However, proper choices of rotation axes can lead to fairly 
regular patterns of diffracted spots. For example, suppose that the incident beam is 
perpendicular to axis b, and the crystal is rotated about this axis. In a rotation 
camera, a cylinder of film surrounds the sample (Fig. 13-25a). Diffracted rays passing 
through a given $k$ level of the reciprocal lattice (say, $h, 0, l$) will all fall on the same
line of the film. However, the order of spots, as a function of $h$ and $l$ values, is not
regular, and the overall pattern is quite compressed. A rotation photograph projects 
a whole layer of the reciprocal lattice onto a single line. A typical example is shown 
in Figure 13-25b. 




Figure 13-25 

The rotation camera. This camera projects a plane of the reciprocal lattice onto a line of the film.
(a) Schematic diagram of a rotation camera. (b) Example of a rotation photograph. [From G. H.
Stout and L. M. Jensen, X-Ray Structure Determination (New York: Macmillan, 1968).]



13-3 PROPERTIES OF CRYSTALS 



Clearly, what one would like to have is a way of collecting diffraction data 
organized just like the reciprocal lattice. One way to do this is the precession camera. 
In essence, this camera rotates the sample and the film in such a coupled way that 
diffraction spots from all individual lines of the reciprocal lattice appear as properly 
spaced lines on the photographic film. The details of operation of such a camera are 
complex, and the interested reader can find them elsewhere. The results are photo- 
graphs that each show one whole plane of reciprocal space. A set of such photographs 
permits one to reconstruct all accessible data about the diffraction pattern. Figure 
13-26 shows an example. Two-dimensional scanning film densitometers are used to 
convert the x-ray photograph into a series of indexed integrated scattering intensities. 
Figure 13-26

A precession photograph. An entire plane of the reciprocal lattice is displayed without distortion. The 
sample is a tetragonal crystal of lysozyme. Note the presence of a fourfold rotation axis and various 
mirror-symmetry planes in the diffraction pattern. [Courtesy of C. C. F. Blake.] 
An alternative to the precession camera, now in much more common use, is the
automated diffractometer. Once the crystal class and unit cell of the sample are 
known, its absolute orientation in space can be determined. Then, it is possible to 
predict the sample and detector geometry needed to produce a spot with particular 
indices $h$, $k$, $l$. This information is given to a computer, which finds the spot, measures
the intensity, rotates the sample and detector to where the next spot should be, and 
so on. The x-ray intensity can be measured directly by solid scintillation detectors. 

The limiting sphere of the reciprocal lattice 

The reciprocal lattice, in principle, is infinite. Each of the indices $h$, $k$, $l$ varies from
$-\infty$ to $+\infty$. The Fourier inversion needed to calculate the electron density distribution
from x-ray structure factors is an infinite sum over all three indices (Eqn. 13-39).
It would not be practical to collect data over an infinite reciprocal lattice. More
significantly, it is not even possible, because the finite wavelength of the x rays used
limits the largest values of the indices $h$, $k$, $l$ that yield diffraction intensity.

Return to Figure 13-23a and note the position of the sphere of reflection. Rota- 
tion of the crystal about any of the three laboratory axes will bring reciprocal lattice 
points into contact with the spherical surface, but cannot possibly reach any reciprocal
lattice points that lie farther from the origin than the diameter of the sphere.
The longest possible scattering vector has a length $2/\lambda$. Thus, even if all possible
geometries are tried, no reciprocal lattice points farther from the origin than $2/\lambda$
can be sampled.

This limitation defines a sphere of radius $2/\lambda$, centered at the origin (Fig. 13-23b).
The sphere is precisely twice the diameter of the sphere of reflection. It is called the 
limiting sphere. All reciprocal lattice points contained within the limiting sphere are 
measurable by a proper choice of experimental geometry. But no points outside of 
the limiting sphere can be detected. The only recourse would be to decrease the 
wavelength of the x rays and thus increase the diameter of the limiting sphere. 
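The limiting-sphere criterion is easy to apply to individual reflections. The sketch below assumes an orthogonal unit cell, so that $|\mathbf{S}|^2 = (h/a)^2 + (k/b)^2 + (l/c)^2$; the cell edges are hypothetical, and the two wavelengths are the usual copper and molybdenum K-alpha values.

```python
import math

def within_limiting_sphere(h, k, l, cell, wavelength):
    """True if |S| <= 2/lambda for an orthogonal cell with edges (a, b, c)."""
    a, b, c = cell
    S = math.sqrt((h / a) ** 2 + (k / b) ** 2 + (l / c) ** 2)
    return S <= 2.0 / wavelength

def max_index(edge, wavelength):
    """Largest index reachable along one axis, from h/a <= 2/lambda."""
    return math.floor(2.0 * edge / wavelength)

cell = (50.0, 60.0, 70.0)                  # hypothetical cell edges, angstroms
for wavelength in (1.54, 0.71):            # Cu K-alpha and Mo K-alpha, angstroms
    print(wavelength, max_index(cell[0], wavelength))
```

Shortening the wavelength enlarges the limiting sphere and so raises the highest accessible index along every axis.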

Limitations on the resolution of structures calculated 
from x-ray diffraction data 

What is the result of our inability to measure scattering throughout the whole 
reciprocal space? Larger distances in reciprocal space correspond to smaller dis- 
tances within the real crystal lattice. Therefore, with only a finite set of diffraction 
data, the ability to discriminate fine details of the electron density distribution is lost.
In short, the resolution of the structure determination is decreased. It is useful to 
examine this statement more quantitatively. 

What fraction of the data within the limiting sphere must be collected in order 
to produce a final structure determined to a given resolution? A vector S in reciprocal 
space is $h\mathbf{a}^* + k\mathbf{b}^* + l\mathbf{c}^*$. Its length is $|\mathbf{S}|$; the dimensions are Å$^{-1}$. Therefore, $|\mathbf{S}|$
corresponds to a distance $d = 1/|\mathbf{S}|$. One can estimate* that a collection of all diffraction
data up to a value of $|\mathbf{S}|$ ought to contain the information needed to determine a
structure with a resolution of around $1/|\mathbf{S}|$ Å.

The implications of limited resolution are best seen by a purely theoretical 
example. Figure 13-27a shows a section of $\beta$ sheet. For simplicity, we shall view this
as projected into the a-b plane. (See Box 13-5 for a discussion of how a projection is 
carried out mathematically.) The unit cell illustrated in Figure 13-27a repeats to 
form an infinite two-dimensional lattice. The structure factor produced by x-ray 
scattering from this array can be calculated exactly using the two-dimensional analog 
of Equation 13-70: 

$$F_m(h, k) = \sum_j f_j(S)\,e^{2\pi i(hx_j + ky_j)} \tag{13-88}$$

The indices $h$ and $k$ can be evaluated for any integral values we want, from $-\infty$
to $+\infty$.

Figure 13-27b shows part of the resulting set of structure factor data. Note that 
this plot illustrates both the phase and the amplitude of the structure factor. Because 
the two-stranded $\beta$ sheet projected into two dimensions has a center of symmetry,
the structure factor is real rather than complex, and the phase term reduces to just
a sign (+ or −) as described earlier in the chapter.

Given a set of x-ray scattering data such as that shown in Figure 13-27b, we 
can calculate the structure that produced it by using Equation 13-39. In two dimen- 
sions, the result is 

$$\rho(x, y) = (1/A) \sum_{h=-\infty}^{\infty} \sum_{k=-\infty}^{\infty} F_m(h, k)\,e^{-2\pi i(hx + ky)} \tag{13-89}$$

where $A$ is the area of one unit cell. However, in practice, we cannot measure values
of $F_m(h, k)$ with $h$ and $k$ extending all the way out to $\pm\infty$. Suppose that $|\mathbf{S}|$ could be
measured only out to 1/4 Å$^{-1}$. This restricts $h$ and $k$ to values that fall within a
circle of radius $|\mathbf{S}| = |h\mathbf{a}^* + k\mathbf{b}^*|$ drawn about the origin. (This is the innermost circle
drawn in Fig. 13-27b.) It restricts $h$ and $k$ to values of $-1$, 0, and 1. If Equation 13-89
is used with just these terms, it produces an image of the structure at about 4 Å
resolution. The result (Fig. 13-27c) suggests two strands of peptide, but obscures all 
molecular details. 

Extending the data set used to larger values of |S| produces higher-resolution 
images (Fig. 13-27d,e). Note that in these images some regions of negative electron 
density (dashed contours) are included. These occur because the data set used is 
still finite. A perfect image of the structure can be obtained only when an infinite 
data set is available. Because such a set cannot be obtained in practice, corrections 
are used to compensate for the truncation of the series in Equation 13-89. 
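The loss of detail caused by truncating the Fourier series can be reproduced with a toy model. The sketch below is not the $\beta$-sheet calculation of Figure 13-27: it assumes a hypothetical 10 Å × 10 Å orthogonal cell containing unit point atoms, two of them 1.5 Å apart together with their centrosymmetric mates, and it sums only the terms with $|\mathbf{S}| \le 1/d$.

```python
import cmath
import math

def structure_factor(atoms, h, k):
    """Two-dimensional structure factor for unit point atoms (fractional x, y)."""
    return sum(cmath.exp(2j * math.pi * (h * x + k * y)) for x, y in atoms)

def density(atoms, cell, d_min, x, y):
    """Truncated Fourier synthesis: only terms with |S| <= 1/d_min are summed."""
    a, b = cell
    hmax, kmax = int(a / d_min), int(b / d_min)
    total = 0.0
    for h in range(-hmax, hmax + 1):
        for k in range(-kmax, kmax + 1):
            if math.hypot(h / a, k / b) <= 1.0 / d_min:
                F = structure_factor(atoms, h, k)
                total += (F * cmath.exp(-2j * math.pi * (h * x + k * y))).real
    return total / (a * b)

cell = (10.0, 10.0)                                   # hypothetical cell, angstroms
atoms = [(0.20, 0.30), (0.35, 0.30),                  # two atoms 1.5 angstroms apart
         (0.80, 0.70), (0.65, 0.70)]                  # their centrosymmetric mates
for d in (4.0, 2.0, 1.0):
    at_atom = density(atoms, cell, d, 0.20, 0.30)
    midway = density(atoms, cell, d, 0.275, 0.30)
    print(d, round(at_atom, 3), round(midway, 3))
```

At 4 Å the density midway between the two neighbors exceeds the density at either atom, so they merge into a single feature; at 1 Å the atoms stand out as separate peaks. Negative values between peaks at the higher resolutions are the truncation ripples mentioned above.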



* From the theory of image formation, if all scattered waves are measured with wavelengths of $d$ Å or
more, one should be able to resolve structural features separated by >0.6$d$ Å. In reality, x-ray data are
not perfect, and a more realistic estimate is $d$ Å.
Figure 13-27

Electron density maps as a function of resolution. (a) Two strands of poly-L-alanine antiparallel $\beta$ sheet
within a two-dimensional unit cell. $C_2$ and $2_1$ symmetry axes of a planar projection of the structure are
indicated. (b) Calculated structure factor data for a two-dimensional crystal, formed by projecting the
structure shown in part a onto the a-b plane. The circles show the data that would be sampled for
analysis at resolutions of 4 Å, 2 Å, and 1 Å (indicated by increasingly large circles). The filled dots
indicate $F(h, k) > 0$; the open dots indicate $F(h, k) < 0$. The size of each dot is proportional to $|F(h, k)|$.
(c) An electron density map at 4 Å resolution, calculated from part b. (d) An electron density map at
2 Å resolution, calculated from part b. (e) An electron density map at 1 Å resolution, calculated from
part b. [After R. D. B. Fraser and T. P. McRae, in Physical Principles and Techniques of Protein
Chemistry, part A, ed. S. J. Leach (New York: Academic Press, 1969).]



Experimental limitations on resolution 

The example just discussed illustrates the limitations of reconstructing a structure 
even if perfect data were available. Experimentally, the wavelength of radiation used 
limits sampling of the reciprocal lattice to values of $|\mathbf{S}| \le 2/\lambda$.

Why not always collect data sufficient for structure determination at the highest 
resolution allowed by this limiting sphere? There are three practical considerations. 
A given crystal will always have some disorder; thus, the x-ray data corresponding 
to short distances may be nonexistent. The amount of computation needed to compute
the structure rises sharply with the number of data points. And the number of
diffraction spots, which is equal to the number of reciprocal lattice points contained
within a sphere of radius $|\mathbf{S}|$, grows as the volume of the sphere.

The number of reciprocal lattice points within a sphere of radius $|\mathbf{S}|$ is approximately
equal to the number $n$ of reciprocal cells contained within the sphere. If $V^*$
is the volume of one reciprocal lattice cell, and $(4/3)\pi|\mathbf{S}|^3$ is the volume of the sphere,
then

$$n = (4/3)\pi|\mathbf{S}|^3/V^* = V(4/3)\pi/d^3 \tag{13-90}$$

where $V$ is the volume of the real unit cell, and $d = |\mathbf{S}|^{-1}$ is the resolution. Therefore,
the number of diffraction spots that must be measured increases as the inverse cube of
$d$: halving $d$ requires eight times as many measurements.

Two factors decrease the minimal number of diffraction spots or reciprocal 
lattice points needed to contain all structural information for a certain resolution. 
As shown in Equation 13-18, the fact that the electron density is real results in a 
center of symmetry for the diffraction pattern: $F(h, k, l) = F^*(-h, -k, -l)$, where the
asterisk indicates the complex conjugate. Thus only one hemisphere of the limiting
sphere must be measured. Furthermore, in most crystal classes, there is additional
symmetry in the diffraction pattern when plotted in reciprocal space (Table 13-1). 

A tetragonal crystal will have a fourfold rotation axis. The diffraction pattern 
of such a crystal is completely defined by only one octant of reciprocal space. Con- 
sider a crystal of cytochrome c in the tetragonal class. The unit-cell dimensions are 
$a = b = 58.5$ Å and $c = 42.3$ Å. The unit-cell volume is $abc = 144{,}700$ Å$^3$. Equation
13-90 indicates that, for a limiting sphere sufficient to resolve structure to $d$ Å, the
number of reflections contained is $n = 606{,}400/d^3$. Because of the tetragonal class,
the number of unique diffraction spots is only one-eighth of this: $75{,}800/d^3$. In



Box 13-5 PROJECTIONS OF ELECTRON DENSITY DISTRIBUTION 

Many times it is convenient to work with a projection of the electron density in a plane rather 
than with the entire three-dimensional electron density distribution. Suppose we choose a 
plane perpendicular to an arbitrary direction q. Any vector r to a given position in the crystal 
can be expressed as the sum of a component along q and a component d perpendicular to q: 

$$\mathbf{r} = \mathbf{d} + q\,\hat{\mathbf{q}}$$

where $\hat{\mathbf{q}}$ is a unit vector, and q is the magnitude of the projection along $\hat{\mathbf{q}}$.
The electron density distribution of the crystal is 

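$$\rho(\mathbf{r}) = \int_{-\infty}^{\infty} d\mathbf{S}\, F(\mathbf{S})\, e^{-2\pi i \mathbf{S}\cdot\mathbf{r}} = \int_{-\infty}^{\infty} d\mathbf{S}\, F(\mathbf{S})\, e^{-2\pi i \mathbf{S}\cdot\mathbf{d}}\, e^{-2\pi i (\mathbf{S}\cdot\hat{\mathbf{q}})\, q}$$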

Its projection onto a plane perpendicular to q is simply the integral of p(r) over all q: 

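$$\rho_q(\mathbf{d}) = \int_{-\infty}^{\infty} dq \int_{-\infty}^{\infty} d\mathbf{S}\, F(\mathbf{S})\, e^{-2\pi i \mathbf{S}\cdot\mathbf{d}}\, e^{-2\pi i (\mathbf{S}\cdot\hat{\mathbf{q}})\, q} = \int_{-\infty}^{\infty} d\mathbf{S}\, F(\mathbf{S})\, e^{-2\pi i \mathbf{S}\cdot\mathbf{d}} \int_{-\infty}^{\infty} dq\, e^{-2\pi i (\mathbf{S}\cdot\hat{\mathbf{q}})\, q}$$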

The second integral is just the Dirac delta function, δ(S · q). Therefore, it vanishes unless
S · q = 0 or, in other words, unless S is in a plane perpendicular to q, whereupon the second
integral is unity. So, if S_q represents all scattering vectors perpendicular to q, then

$$\rho_q(\mathbf{d}) = \int_{-\infty}^{\infty} d\mathbf{S}_q\, F(\mathbf{S}_q)\, e^{-2\pi i \mathbf{S}_q\cdot\mathbf{d}} \tag{A}$$

The projection integral is carried out only in a plane of reciprocal space perpendicular to the 
projection axis. If an inverse Fourier transform is performed, 

$$F(\mathbf{S}_q) = \int_{-\infty}^{\infty} d\mathbf{d}\, \rho_q(\mathbf{d})\, e^{2\pi i \mathbf{S}_q\cdot\mathbf{d}} \tag{B}$$

Equations A and B are quite useful. They imply that, if one measures the x-ray scattering
in a plane of reciprocal space, F(S_q), the electron density of a projection of the structure onto
that plane can be computed. Alternatively, any plane in reciprocal space will contain infor- 
mation only about the electron density of the molecule projected onto a plane. It is common 
to choose a projection along crystal axes. For example, suppose we choose q to be the c axis. 
Then, after inserting the von Laue conditions, Equation A becomes 

$$\rho(x, y) = \frac{1}{A} \sum_{h=-\infty}^{\infty} \sum_{k=-\infty}^{\infty} F(h, k, 0)\, e^{-2\pi i (hx + ky)}$$

practice, this implies that, for 4 Å resolution, about 1,200 diffraction spots must be
measured. This number increases to 9,500 for 2 Å resolution, and to 75,800 for 1 Å
resolution. Clearly, for large unit cells such as those found in macromolecular
crystals, the amount of work needed to improve resolution can be quite formidable.
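The reflection counts quoted for cytochrome c follow from Equation 13-90 by simple arithmetic. The sketch below, in Python, reproduces them; the function names `reflections_in_sphere` and `unique_reflections` are illustrative and not from the original text, and the one-eighth factor is the tetragonal value stated above.

```python
import math

def reflections_in_sphere(cell_volume, d):
    """Equation 13-90: n = (4/3) * pi * V / d**3 (V in cubic angstroms, d in angstroms)."""
    return (4.0 / 3.0) * math.pi * cell_volume / d ** 3

def unique_reflections(cell_volume, d, symmetry_fraction=1.0 / 8.0):
    """Unique spots; the default one-eighth is the tetragonal factor quoted in the text."""
    return reflections_in_sphere(cell_volume, d) * symmetry_fraction

# Cytochrome c unit cell from the text: a = b = 58.5, c = 42.3 (angstroms).
cell_volume = 58.5 * 58.5 * 42.3

for d in (4.0, 2.0, 1.0):
    n_unique = unique_reflections(cell_volume, d)
    print(f"d = {d:.0f} angstroms: about {n_unique:,.0f} unique reflections")
```

The results round to the values given in the text, and the eightfold jump for each halving of d makes the cost of higher resolution explicit.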



Note that this is the projection of the electron density onto a plane perpendicular to the c 
axis. That plane is not necessarily the a-b plane unless the crystal symmetry is such that a 
and b are perpendicular to c. A is the area of the projection of the a-b face of the unit cell 
perpendicular to the c axis, as shown in the figure. 
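The content of Equations A and B can be illustrated with a discrete analogue. The following is a rough sketch in Python with NumPy, assuming mutually perpendicular axes and a small random array `rho` standing in for the electron density (both are assumptions made for illustration). It checks that the (h, k, 0) plane of the three-dimensional transform is exactly the two-dimensional transform of the density projected along c.

```python
import numpy as np

# Hypothetical density rho[x, y, z] on an 8 x 8 x 8 grid (assumption).
rng = np.random.default_rng(1)
rho = rng.random((8, 8, 8))

# Projection of the density along the c (z) direction.
projection = rho.sum(axis=2)

# Full three-dimensional transform and the transform of the projection.
F3 = np.fft.fftn(rho)
F2 = np.fft.fft2(projection)

# The l = 0 plane of reciprocal space carries exactly the projected density.
central_section_matches = np.allclose(F3[:, :, 0], F2)

# Conversely, inverting only the (h, k, 0) data returns the projection.
recovered = np.fft.ifft2(F3[:, :, 0]).real
print(central_section_matches, np.allclose(recovered, projection))
```

For a non-orthogonal cell the same relation holds, but the plane of projection is perpendicular to c rather than the a-b plane, as noted above.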




Projections of electron density (panels: unit cell, planar projection, linear projection).
A unit cell containing a hollow molecular cylinder. The molecule projects into a hollow
ellipse in a plane perpendicular to c. This ellipse can be further projected onto a line.



With many molecules, properly chosen projections may have an apparent symmetry not 
visible in the structure of the whole. Examination of the x-ray diffraction corresponding to 
such projections often can help simplify the determination of the structure. When crystal- 
lographers show diffraction patterns, they virtually always display data for one plane through 
the reciprocal lattice. Usually, this is a plane in which the index of one reciprocal-cell direction 
is zero: that is, h, k, 0 or h, 0, l or 0, k, l.

In a comparable way, a row of points of the reciprocal lattice will contain the data 
needed to calculate the projection of the electron density onto a line. That line will be the 
intersection of the planes perpendicular to the two projection directions, as shown in the 
figure. For example, where a, b, and c are all mutually perpendicular, the zero layer line
(h, 0, 0) describes the projection of the electron density along the a axis.

13-4 DETERMINATION OF MOLECULAR STRUCTURE
BY X-RAY CRYSTALLOGRAPHY

The phase problem 

We have seen that it is relatively easy to determine the properties of a crystal lattice 
from its measured diffraction pattern. However, the central problem in x-ray
crystallography is to determine ρ(r), the electron density distribution within the unit cell.
In principle, we solved this problem with Equation 13-39:

$$\rho(x, y, z) = \frac{1}{NV} \sum_{h=-\infty}^{\infty} \sum_{k=-\infty}^{\infty} \sum_{l=-\infty}^{\infty} F(h, k, l)\, e^{-2\pi i (hx + ky + lz)} \tag{13-39}$$

However, an enormous practical deterrent exists. As mentioned earlier (Eqn. 13-12),
each F(h, k, l) structure factor is a complex number consisting of an amplitude
|F| and a phase term e^{iφ}. Only the square of the amplitude, |F|², can be observed
experimentally. The phase angle φ can have any value between 0 and 2π.

In the special case of a crystal with a center of symmetry, the phase is much 
more restricted. For such a crystal ρ(r) = ρ(−r). As shown earlier, F(h, k, l) is real;
the phase can be either 0 or π, and e^{iφ} is ±1. This means that only the sign of each
term in the Fourier synthesis of the electron density is unknown. However, even in
these cases, there are 2ⁿ possible choices of phase for a set of n diffraction spots.
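The centrosymmetric case can be illustrated with a one-dimensional toy density. This is a minimal sketch in Python with NumPy; the 16-point grid, the random values, and the choice of 20 reflections for the count are assumptions made for illustration only.

```python
import numpy as np

n = 16
rng = np.random.default_rng(2)
half = rng.random(n)

# Symmetrize so that rho[x] == rho[-x mod n]: a center of symmetry at the origin.
idx = np.arange(n)
rho = half + half[(-idx) % n]

F = np.fft.fft(rho)

# For a centrosymmetric density every structure factor is real ...
is_real = np.allclose(F.imag, 0.0)
# ... so each phase is 0 or pi, and only a sign (+1 or -1) is unknown.
signs = np.sign(F.real)

# Even so, n_spots reflections leave 2**n_spots sign combinations to choose among.
n_spots = 20
n_sign_choices = 2 ** n_spots
print(is_real, n_sign_choices)
```

Twenty reflections already allow more than a million sign assignments, so exhaustive trial of signs is impractical for any realistic data set.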

For biological samples, the inability to measure phases experimentally poses a 
truly serious problem. Not only is the number of diffraction spots large, but also 
such crystals cannot have centers of symmetry because they contain asymmetric 
carbon atoms. Certain techniques we shall describe can help to infer partial phase 
information. Sometimes chemical insight or other previously known information 
about the molecule must be included in order to assist the structure determination. 
The methods usually are sufficient to permit phase estimates accurate enough to 
compute a three-dimensional structure. It must be borne in mind that occasionally 
the methods described below can converge on an incorrect structure. 



Phases are more important than amplitudes

Because one has measured amplitudes but not phases, it is of interest to ask which 
of these two factors is most crucial in establishing the correct structure. This can be 
determined by taking a known structure and calculating the correct structure factors
|F|e^{iφ}. If these are then inserted back into Equation 13-39, naturally an accurate image
of the electron density distribution appears. This is shown for a two-dimensional
projection of a portion of β sheet in Figure 13-27e.

Figure 13-28

Relative importance of intensities and phases in computing an electron density map from diffraction data.
The sample and data are the same as those shown in Figure 13-27. (a) Correct amplitudes were used in
this Fourier synthesis, but all phases were set equal to zero. (b) Correct phases were used in this
Fourier synthesis, but all amplitudes were set equal to the same average value. [After R. D. B. Fraser
and T. P. MacRae, in Physical Principles and Techniques of Protein Chemistry, part A, ed. S. J. Leach
(New York: Academic Press, 1969).]

Suppose instead that all the correct amplitudes are used, but each phase is
arbitrarily assigned at the same value, 0°. The resulting Fourier synthesis (Fig. 13-28a)
bears no resemblance to a β sheet. Next, suppose all amplitudes are set at the same
equal value, |F| = (Σ|F_h|²)^(1/2), where the sum is taken over the squares of all of
the amplitudes of the diffraction pattern. This corresponds to an average over all
measured intensities. If these are combined with the correct phases in a Fourier
synthesis, the result (Fig. 13-28b) clearly has substantial resemblance to a β sheet.
Thus, we have the unfortunate situation that the unmeasurable quantities are
actually more useful than those that can be obtained experimentally.
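The same comparison can be tried on a synthetic example. The sketch below, in Python with NumPy, is not the calculation behind Figure 13-28: the six atom positions, the 64 × 64 grid, the Gaussian atom width, and the use of the root-mean-square amplitude as the common value are all assumptions chosen for illustration. It builds one map from correct amplitudes with zero phases and one from correct phases with equal amplitudes, then correlates each with the true density.

```python
import numpy as np

def gaussian_atoms(n, positions, sigma):
    """Periodic 2-D density with one Gaussian 'atom' per (x, y) grid position."""
    grid = np.arange(n)
    rho = np.zeros((n, n))
    for px, py in positions:
        dx = np.abs(grid - px)
        dx = np.minimum(dx, n - dx)          # shortest periodic distance along x
        dy = np.abs(grid - py)
        dy = np.minimum(dy, n - dy)          # shortest periodic distance along y
        rho += np.exp(-(dx[:, None] ** 2 + dy[None, :] ** 2) / (2.0 * sigma ** 2))
    return rho

n = 64
atoms = [(12, 20), (27, 45), (41, 14), (52, 50), (33, 31), (18, 56)]  # hypothetical
rho = gaussian_atoms(n, atoms, sigma=0.8)

F = np.fft.fft2(rho)
amplitude, phase = np.abs(F), np.angle(F)

# (a) Correct amplitudes, every phase set to zero.
map_amplitudes_only = np.fft.ifft2(amplitude).real

# (b) Correct phases, every amplitude set to the same (rms) value.
flat = np.sqrt(np.mean(amplitude ** 2))
map_phases_only = np.fft.ifft2(flat * np.exp(1j * phase)).real

def correlation(a, b):
    """Pearson correlation between two maps, taken point by point."""
    return np.corrcoef(a.ravel(), b.ravel())[0, 1]

cc_amplitudes_only = correlation(rho, map_amplitudes_only)
cc_phases_only = correlation(rho, map_phases_only)
print(f"amplitudes only: {cc_amplitudes_only:.2f}, phases only: {cc_phases_only:.2f}")
```

In this toy case the phases-only map correlates far better with the true density than the amplitudes-only map, in line with the qualitative conclusion drawn from Figure 13-28.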



General considerations in solving a crystal structure 

The problem of solving a structure starts with an enormous number of unknowns: 
the location of each atom in the unit cell, the type of the atom and therefore the 
expected atomic scattering factor, and the phase associated with each diffraction 
spot. There is also a considerable amount of available data: the diffraction intensities, 
the space group and unit cell of the crystal, and usually a significant amount of 
information about the molecule being examined (for example, partial or complete 
chemical structure, and perhaps even some conformational data). 

The most general goal is to find a structure for the molecule that represents a 
best fit to the available diffraction data and does not violate, without due cause, our 
chemical intuition and the set of available structural data. Putting it this way makes 
it clear that x-ray structure determination is not in practice an absolute technique. 
In most macromolecular structure cases, one must rely on other information besides 
diffraction data. That is, there are not enough pure x-ray data available to establish
a unique location and identity for each atom in the structure. Even if all the phases
of the scattering factors were experimentally measurable, there might not be enough 
information. One must marvel, then, at the courage of the first scientists to tackle 
macromolecular crystal structures. 

Steps in determining the structure of a small molecule 

Here we illustrate some of the procedures used to determine the structure of a small 
molecule. These are not necessarily the most powerful methods currently available, 
but they provide a useful comparison with our later listing of the techniques used 
on large molecules. 

1. One attempts to prepare suitable crystals, and then determines the space 
group and the unit-cell dimensions, and collects a set of amplitude data 
|F_o(h, k, l)|.

2. One attempts to find the locations of a few of the atoms. This can be done by 
direct methods (see Blundell and Johnson, 1976), or by a search for a few 
heavy atoms using the Patterson function discussed later. 

3. Once the position of a few atoms is known, the contribution F_H that these
atoms make to the total scattering can be calculated by using Equation 13-70:

$$F_H(h, k, l) = \sum_j f_j(\mathbf{S})\, e^{2\pi i (hx_j + ky_j + lz_j)} \tag{13-91}$$

where the index j runs over all atoms of known position. Note, however, that 
the structure factor observed experimentally (F_o) is the sum of contributions
from the known atoms and from all atoms yet to be found (F_U):

$$F_o(h, k, l) = F_H(h, k, l) + F_U(h, k, l) \tag{13-92}$$

What is crucial, however, is that, because we calculated it, F_H(h, k, l) contains
phase as well as amplitude data. 

4. The phase of F_H(h, k, l) can be used in several ways to estimate the phase
associated with F_o(h, k, l). Then a Fourier synthesis can be performed to
compute an estimate of the electron density distribution from the known 
heavy atom positions: 









$$\rho(x, y, z) = \sum_{h=-\infty}^{\infty} \sum_{k=-\infty}^{\infty} \sum_{l=-\infty}^{\infty} |F_o(h, k, l)|\, e^{i\phi_{hkl}}\, e^{-2\pi i (hx + ky + lz)} \tag{13-93}$$

Here, the structure factor has been explicitly divided into amplitude and
phase terms. Note that it is essential to use measured amplitudes. If both
calculated phases and amplitudes are used in Equation 13-93 (that is, if
Eqn. 13-91 is simply inserted into Eqn. 13-93), all that can come out for
ρ(x, y, z) is precisely the known atom positions that were originally put into
Equation 13-91 to compute F_H(h, k, l).

5. The electron density distribution calculated by Equation 13-93 with even 
partial phase information will show some definite maxima corresponding to 
the locations of new atoms or groups of atoms. These in turn can be used in 
Equation 13-91 to compute more accurate phases, which then are used in 
Equation 13-93 to compute a new electron density map. This process, called 
a Fourier refinement, continues alternately until all atoms consistent with 
one's original or revised expectations have been found. 

6. The structure at this point is still a very approximate one. The original subset 
of atoms used to start the bootstrap process rolling probably are not placed 
all that precisely. The electron density distribution that finally results is not 
always all that sharp. It usually is impossible to assign precise coordinates 
to all atoms. Furthermore, experimental errors in observed F(h, k, l) affect the
data, and these must be dealt with in a systematic way. Thus, the sixth and 
final phase of x-ray structure determination is to allow the molecular struc- 
ture to vary somewhat in an attempt to maximize the agreement between the 
computed structure and the observed data. One way of doing this, called a 
least-squares refinement, is illustrated later. 
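Steps 3 and 4 can be tried on a one-dimensional toy crystal. The following is a hedged sketch in Python with NumPy, not a method from the text: the atom positions, electron counts, grid size, and Gaussian atom shape are all hypothetical. It combines "measured" amplitudes with phases calculated from a single known heavy atom, and also shows that using calculated amplitudes returns only the heavy atom that was put in.

```python
import numpy as np

def density_1d(n, atoms, sigma=2.0):
    """Periodic 1-D density; atoms is a list of (grid position, electron count)."""
    x = np.arange(n)
    rho = np.zeros(n)
    for pos, z in atoms:
        d = np.abs(x - pos)
        d = np.minimum(d, n - d)             # shortest periodic distance
        rho += z * np.exp(-d ** 2 / (2.0 * sigma ** 2))
    return rho

n = 256
heavy = [(40, 80)]                           # one known heavy atom (hypothetical)
light = [(95, 8), (150, 7), (210, 8)]        # atoms still to be found (hypothetical)

# "Observed" amplitudes come from the full structure; their phases are discarded.
F_obs_amplitude = np.abs(np.fft.fft(density_1d(n, heavy + light)))

# Equation 13-91 analogue: structure factors of the known atom, with phases.
F_H = np.fft.fft(density_1d(n, heavy))
phi_H = np.angle(F_H)

# Equation 13-93 analogue: observed amplitudes combined with heavy-atom phases.
map_obs_amplitudes = np.fft.ifft(F_obs_amplitude * np.exp(1j * phi_H)).real

# Calculated amplitudes and calculated phases: only the input atom comes back.
map_calc_amplitudes = np.fft.ifft(F_H).real

for pos, z in light:
    print(pos, round(map_obs_amplitudes[pos], 2), round(map_calc_amplitudes[pos], 2))
```

In this sketch the light atoms appear at roughly half their true height, together with spurious mirror images through the heavy atom, because a single atom gives a centrosymmetric phase set. Feeding the newly located atoms back into the phase calculation is the Fourier refinement of step 5.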

In computing ρ(x, y, z) from Equation 13-93 in practice, only a finite set of
values of x, y, and z can be used. It is common to compute ρ for planar sections
through the crystal (that is, to vary x and y, but leave z constant). Even then, only
discrete values of x and y are used. From the resulting pattern of density for each 
plane, smooth contours are drawn representing areas of equal density. Usually, one 
interpolates available data to bring this about. Individual two-dimensional sections 
are produced by a computer. These can be drawn on transparent sheets and stacked 
up to give a three-dimensional image of the structure. Figure 13-29 shows an exam- 
ple. Alternatively, computer-generated displays can be viewed on a cathode ray 
tube from any desired perspective. 



Figure 13-29

A three-dimensional electron density map of ribonuclease S, constructed by plotting density on
lucite sheets and then stacking the sheets. [Provided by Frederick Richards.]



Calculating the Patterson function from measured scattering

The scattering intensities actually measured in an x-ray diffraction experiment are
given by

$$I(\mathbf{S}) = |F(\mathbf{S})|^2 = F(\mathbf{S})\,F^*(\mathbf{S}) \tag{13-94}$$

where the asterisk indicates the complex conjugate. If we knew F(S), we could
Fourier-transform it to obtain the electron density distribution of the entire crystal.
If instead we Fourier-transform I(S) directly, the result is called the Patterson
function:

$$P(\mathbf{u}) = \int_{-\infty}^{\infty} I(\mathbf{S})\, e^{-2\pi i \mathbf{S}\cdot\mathbf{u}}\, d\mathbf{S} = \int_{-\infty}^{\infty} d\mathbf{S}\, F(\mathbf{S})\,F^*(\mathbf{S})\, e^{-2\pi i \mathbf{S}\cdot\mathbf{u}} \tag{13-95}$$

This equation is the Fourier transform of the product of two functions, F(S) and 
F*(S). Therefore (by Eqn. 13-52) it is equal to the convolution of the Fourier
transforms of F(S) and F*(S).

The Fourier transform of F(S) is 

$$\int_{-\infty}^{\infty} d\mathbf{S}\, e^{-2\pi i \mathbf{S}\cdot\mathbf{r}}\, F(\mathbf{S}) = \rho(\mathbf{r}) \tag{13-96}$$

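(as in Eqn. 13-8). What is the Fourier transform of F*(S)? Because ρ(r) is real, F*(S)
is just (from Eqn. 13-7)

$$F^*(\mathbf{S}) = \left( \int_{-\infty}^{\infty} d\mathbf{r}\, e^{2\pi i \mathbf{S}\cdot\mathbf{r}}\, \rho(\mathbf{r}) \right)^{*} = \int_{-\infty}^{\infty} d\mathbf{r}\, e^{-2\pi i \mathbf{S}\cdot\mathbf{r}}\, \rho(\mathbf{r}) = \int_{-\infty}^{\infty} d\mathbf{r}\, e^{2\pi i \mathbf{S}\cdot\mathbf{r}}\, \rho(-\mathbf{r}) \tag{13-97}$$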



Therefore, the Fourier transform of F*(S), by analogy to what is shown in Equations 
13-9 through 13-11, is just ρ(−r). This is the electron density inverted through the
origin. Thus 



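$$P(\mathbf{u}) = \rho(\mathbf{r}) \ast \rho(-\mathbf{r}) = \int_{-\infty}^{\infty} d\mathbf{r}\, \rho(\mathbf{r})\, \rho(\mathbf{u} + \mathbf{r}) \tag{13-98}$$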
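The equivalence between the transform of the intensities and the convolution integral can be checked on a small grid. This is an illustrative sketch in Python with NumPy: the 16 × 16 grid and the three point atoms with weights 6, 8, and 7 are assumptions, and NumPy's transform has the opposite sign convention to the text, which makes no difference here because the Patterson function of a real density is centrosymmetric.

```python
import numpy as np

n = 16
rho = np.zeros((n, n))
rho[2, 3] = 6.0      # three hypothetical point atoms, weighted by electron count
rho[7, 5] = 8.0
rho[4, 11] = 7.0

# Route 1: transform the intensities |F|^2 directly (Equation 13-95).
F = np.fft.fft2(rho)
intensity = np.abs(F) ** 2
patterson_fft = np.fft.ifft2(intensity).real

# Route 2: the convolution integral, sum over r of rho(r) * rho(u + r) (Equation 13-98).
patterson_direct = np.zeros((n, n))
for ux in range(n):
    for uy in range(n):
        shifted = np.roll(rho, shift=(-ux, -uy), axis=(0, 1))   # shifted[r] = rho[r + u]
        patterson_direct[ux, uy] = np.sum(rho * shifted)

print(np.allclose(patterson_fft, patterson_direct))
print(patterson_fft[0, 0], patterson_fft[5, 2])
```

The origin value is the sum of the squared weights, and the value at u = (5, 2) is the product of the weights of the two atoms separated by that vector. No phase information was needed for either number.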



Figure 13-30

The Patterson function for a three-atom structure. (a) Four unit cells of the lattice; the molecule at the
origin is shaded. (b) The same four unit cells, inverted through the origin. (c) Construction of the
convolution of the arrays shown in parts a and b. The position of each atom in part b is used as the
origin to lay down an image of the structure in part a. For example, note that, when atom 2 in part b is
the origin, atom 1 in part a is displaced from the origin, but atom 2 in part a winds up at the origin.
Therefore, an equivalent description of the convolution is simply to lay down successive images of part
a with each atom in turn at the origin. (d) The overall convolution, adding the various contributions
shown separately in part c. The numbers by each point indicate the way in which that point arose. For
example, 1-2 means that an image of atom 2 resulted when the structure was laid down with atom 1 at
the origin. [After J. P. Glusker and K. N. Trueblood, Crystal Structure Analysis: A Primer (London:
Oxford Univ. Press, 1972).]

The physical idea behind the convolution integral helps generate some feeling for
the properties of the Patterson function. Figure 13-30 shows a simple case. It is a
two-dimensional crystal containing one three-atom molecule per unit cell. Figure
13-30a shows the structure of the crystal, and Figure 13-30b shows the crystal inverted 
through the origin. We want the convolution of these two structures. First, consider 
just a single unit cell of the crystal and a single unit cell of the inverted structure. The 
convolution is formed by choosing each atom in the inverted structure one at a time. 
Lay down an image of the structure of the original unit cell, by superimposing the 
origin (lower left corner) of the cell on top of this atom, and weight this image by the 
electron density of the chosen atom. 

Note that, when the origin of the original cell is superimposed on an atom in the 
inverted structure at -r, the corresponding atom at r in the original structure now 
is located at the origin of the inverted cell. Therefore, the convolution can be con- 
structed by ignoring the inverted structure, shifting the contents of a cell to place 
each atom in turn at the origin, and adding up weighted images of the structure 
(Fig. 13-30c). 

There are three atoms in the structure. The image with each atom at the origin 
has three atoms. So, a total of nine atom images will appear in each unit cell of the 
convolution; three fall right at the origin (Fig. 13-30d). In general, for a molecule with 
N atoms in the unit cell, the Patterson function will have N² peaks in its unit cell.
N of these peaks will occur at the origin, and the remaining N(N − 1) somewhere
within the unit cell. It is clear that the Patterson function will be increasingly cumber- 
some to use or interpret as N grows. 
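The peak count can be verified by listing the vectors explicitly. The sketch below, in Python, uses three hypothetical atom positions on an 8 × 8 grid (an assumption for illustration, not the structure of Figure 13-30) and tallies every image laid down when each atom in turn is shifted to the origin.

```python
from collections import Counter

# Hypothetical atom positions in grid units within an 8 x 8 two-dimensional cell.
atoms = {1: (1, 1), 2: (4, 2), 3: (2, 5)}
cell = (8, 8)

# Shift each atom j to the origin and record where every atom k lands.
peaks = Counter()
for j, r_j in atoms.items():
    for k, r_k in atoms.items():
        u = tuple((a - b) % c for a, b, c in zip(r_k, r_j, cell))
        peaks[u] += 1

n_atoms = len(atoms)
total_images = sum(peaks.values())        # N**2 images in all
at_origin = peaks[(0, 0)]                 # N of them pile up at the origin
off_origin = total_images - at_origin     # N(N - 1) lie elsewhere in the cell
print(total_images, at_origin, off_origin)
```

For N = 3 this gives nine images, three at the origin and six elsewhere. For a protein with thousands of atoms the same count runs to millions of overlapping peaks.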



Periodic repetition of Patterson functions 

The convolution described by Equation 13-98 actually is operated over the whole 
crystal and not over just one unit cell. This fact has a simple consequence. Consider 
a lattice with only a single atom at each unit cell vertex. If one particular atom is used 
to superimpose an image of the whole crystal, the result is to place an atom at every 
vertex of every unit cell. Choosing any other atom results in exactly the same image.

The same argument applies in a molecular crystal. Choose one atom in a par- 
ticular cell to lay down an image. Choose the corresponding atom in a different unit 
cell to lay down an image. The resulting images are coincident, except that the crystal 
is displaced in space by an integral number of unit cells. Thus, like the electron density, 
the Patterson function repeats periodically throughout the crystal. All the information 
of interest can be obtained by concentrating on a single unit cell. With actual data, 
the intensity I(S) is not a continuous function, but is sampled only at the reciprocal
lattice points. Thus, by analogy with Equation 13-39, the integral in Equation 13-95 
is replaced by a sum: 



$$P(x, y, z) = \sum_{h=-\infty}^{\infty} \sum_{k=-\infty}^{\infty} \sum_{l=-\infty}^{\infty} |F(h, k, l)|^2\, e^{-2\pi i (hx + ky + lz)} \tag{13-99}$$



Correspondence of peaks in Patterson function
and vectors between atoms



An alternative description of the Patterson function helps give a feeling for the 
structural information it contains. Note first that, if we change the origins used for 
the unit cells, the resulting Patterson function remains unaltered. That function still 
is constructed by placing each atom in turn at the origin of a cell. Hence, all of the 
peaks in the Patterson function must represent internal structural aspects of the 
unit cell. 

Suppose the unit cell has three atoms, located at r₁, r₂, and r₃. When an image is
laid down by shifting the atom at r₁ to the origin, the peaks in the image are placed at
r₁ − r₁, at r₂ − r₁, and at r₃ − r₁. Therefore, these peaks in the Patterson function
are simply the vectors from each atom in the structure to atom r₁. When r₂ is used as
the origin, we get peaks for vectors from all atoms to r₂, and so on. So the Patterson
function is simply the set of all vectors between pairs of atoms in the structure. It is 
clear why this set does not depend on the choice of an origin. Using this physical 
description, the Patterson function can be rewritten as 



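$$P = \sum_j \sum_k \rho_j\, \rho_k\, (\mathbf{r}_j - \mathbf{r}_k) \tag{13-100}$$

where each index j and k is summed over all atoms in the unit cell with electron
density ρ; that is, a peak of weight ρ_j ρ_k appears at each interatomic vector r_j − r_k.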

The Patterson map contains more than enough information to determine the 
structure. The problem is that there is no efficient or easy strategy to use this infor- 
mation. Unfortunately, the peaks in the Patterson function are not labeled. There is 
no simple way of deciding which pair of atoms a given vector in the map represents. 
One must find a way to deconvolute the Patterson function to extract the structure. 
If just a few atoms are involved, this can easily be done by brute force. Alternatively, 
if the positions of a few atoms are already known, superposition techniques can be 
used to deconvolute the Patterson function (see Blundell and Johnson, 1976; Stout 
and Jensen, 1968). 



Using Patterson maps to locate heavy atoms in small molecules 

For complex molecules, the difficulty in interpreting the Patterson map is a direct 
consequence of the large number of interatom vectors. Suppose, however, that the 
structure contains two or more heavy atoms per unit cell. The atomic scattering factor 
is proportional to the number z of electrons (Eqn. 13-23). Thus observed
intensities, and the resulting Patterson peaks, from atoms i and j are proportional to z_i z_j
(corresponding to the ρ_j ρ_k terms in Eqn. 13-100). This means that vectors between
pairs of heavy atoms are the most dominant feature of a Patterson map. With a
limited number of heavy atoms, it usually is possible to find out enough about their 
locations to proceed with further refinement and structural analysis. 
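The dominance of heavy atom-heavy atom vectors follows from the product z_i z_j. A minimal arithmetic sketch in Python is given below; the function name `patterson_peak_weight` is illustrative, and the electron counts of 80 for mercury and 7 for a typical light atom are the values used later in this section.

```python
def patterson_peak_weight(z_i, z_j):
    """Relative weight of the Patterson peak between atoms with z_i and z_j electrons."""
    return z_i * z_j

z_heavy = 80   # mercury
z_light = 7    # typical light atom

heavy_heavy = patterson_peak_weight(z_heavy, z_heavy)
heavy_light = patterson_peak_weight(z_heavy, z_light)
light_light = patterson_peak_weight(z_light, z_light)

print(heavy_heavy, heavy_light, light_light)
print("heavy-heavy / light-light =", round(heavy_heavy / light_light, 1))
```

A single heavy-heavy peak therefore outweighs a light-light peak by more than a hundredfold, which is why it stands out as long as the number of light atoms is limited.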

Figure 13-31

Locating heavy atoms. (a) A unit cell of a crystal in space group P2₁.
There are two molecules per unit cell, with each molecule containing
one heavy atom. (b) Patterson function calculated for the sample
shown in part a. Only the heavy atom-heavy atom vectors are shown.
The Harker section is colored.

Most space groups contain molecules related by symmetry operations within
the unit cell. If each molecule has a heavy atom, the Patterson vector between two
symmetry-related atoms will fall in an easily identifiable region of the Patterson map.
Consider the example shown in Figure 13-31. This is a monoclinic crystal in space
group P2₁ with two molecules per unit cell. There is a twofold screw axis parallel to
direction b. If there is a heavy atom at position x'a + y'b + z'c, then the other heavy
atom must be located at −x'a + (y' + 1/2)b − z'c. The heavy atom-heavy atom
vector must be located at the position 2x'a + (1/2)b + 2z'c in the Patterson map.
Thus x' and z' can be determined by looking for a peak in the plane of the Patterson
map at (1/2)b.

This procedure still leaves y' undetermined. However, for the P2₁ space group,
the problem is not serious. There is no unique origin along the b axis, and therefore 
y' can be chosen to have any arbitrary value. 
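The Harker-section argument for P2₁ can be followed with exact fractions. The sketch below, in Python, uses a hypothetical heavy-atom position (1/8, 3/10, 1/5) in fractional coordinates; the function names are illustrative and not from the text.

```python
from fractions import Fraction

def screw_mate_p21(x, y, z):
    """Symmetry mate under a twofold screw axis along b: (-x, y + 1/2, -z), wrapped into the cell."""
    return ((-x) % 1, (y + Fraction(1, 2)) % 1, (-z) % 1)

def harker_vector(x, y, z):
    """Patterson vector between a heavy atom and its symmetry mate, wrapped into the cell."""
    xm, ym, zm = screw_mate_p21(x, y, z)
    return ((x - xm) % 1, (y - ym) % 1, (z - zm) % 1)

# Hypothetical heavy-atom position in fractional coordinates.
x, y, z = Fraction(1, 8), Fraction(3, 10), Fraction(1, 5)
u, v, w = harker_vector(x, y, z)
print(u, v, w)   # components along a, b, c

# Reading the peak backwards: x' is u/2 or u/2 + 1/2, and likewise for z'.
x_candidates = (u / 2, u / 2 + Fraction(1, 2))
z_candidates = (w / 2, w / 2 + Fraction(1, 2))
```

The b component is always 1/2 whatever value y' takes, which is why the peak is confined to one plane. Halving the other two components leaves a twofold ambiguity in each, which in this sketch is simply listed rather than resolved.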

Planes or lines where symmetry-related Patterson vectors appear are called 
Harker sections. If one or more Patterson vectors can be assigned by examining these, 
it sometimes is possible to use superposition methods to find others. Each Harker 
section will contain not just the heavy atom-heavy atom vectors, but also all the other 
vectors related by the same symmetry operation. It will not contain any light atom- 
heavy atom vectors except for coincidences. The contrast afforded by the heavy-atom 
pair will be z_h² versus z_l² for each light-atom pair. This raises the question of how heavy
an atom is needed. A general rule of thumb is that z_h² ≥ Σ z_l², where the sum is taken
over all light atoms. For a typical light atom, z_l is 7. Thus a single heavy atom such as
mercury with z_h = 80 could be found in a structure as large as 130 light atoms. It could
not be found in a Patterson map of a typical protein with 1,000 to 10,000 atoms.

Testing agreement between calculated structure and observed data

How can one tell if a calculated structure is in good agreement with the measured 
x-ray diffraction? The most common measure of the agreement is the residual index R : 

$$R = \frac{\sum \big|\, |F_o| - |F_{calc}| \,\big|}{\sum |F_o|} \tag{13-101}$$

where |F_calc| represents structure factors computed from the model of the total
structure by Equation 13-70. Thus the factor R essentially measures how the observed
experimental data |F_o| compare with the data that would be expected for the
calculated structure, |F_calc|.
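The residual index is straightforward to compute once observed and calculated amplitudes are tabulated. A minimal sketch in Python with NumPy follows; the function name `residual_index` and the three sample amplitudes are made up for illustration.

```python
import numpy as np

def residual_index(f_obs, f_calc):
    """R = sum(| |F_o| - |F_calc| |) / sum(|F_o|), taken over all reflections."""
    f_obs = np.abs(np.asarray(f_obs, dtype=float))
    f_calc = np.abs(np.asarray(f_calc))      # complex calculated factors are allowed
    return np.sum(np.abs(f_obs - f_calc)) / np.sum(f_obs)

# Made-up amplitudes for three reflections.
f_obs = [10.0, 20.0, 30.0]
f_calc = [12.0, 18.0, 33.0]

r_value = residual_index(f_obs, f_calc)
print(f"R = {r_value:.3f}")
```

Only amplitudes enter the sum, so R says nothing directly about whether the model phases are right. It measures agreement with the one quantity that is actually observed.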

If the structure is very approximate, it may be simply a random arrangement of 
atoms of the correct numbers, types, and symmetry within the unit cell. In this case, 
it has been shown that R will be 0.59 for a space group without a center of symmetry, 
and 0.83 if a center is present. A very rough rule suggests that R ≤ 0.45 means that
the trial structure is not completely useless; R ≤ 0.35 means definite convergence on
the right track; and R ≤ 0.25 means that most atoms are correctly placed to within
the order of 0.1 Å. Small organic structures often can be refined to R < 0.05. Protein
R values usually are large at early stages in the structure determination. This is 
because solvent and thermal motion effects usually are not taken into account until 
later stages. 

Note that an R value of 0.25 actually implies a degree of disagreement between 
observed and calculated amplitudes that would be considered intolerable in most 
techniques. It probably is fair to say that the quantity of x-ray data makes up for the 
lack of quality of individual data points. 

13-5 DETERMINING THE STRUCTURE OF A MACROMOLECULE 
The method of multiple isomorphous replacements 

The methods described earlier for small molecules are not successful with proteins 
or nucleic acids. These large molecules generally do not contain conveniently placed 
heavy atoms. Even where such atoms do occur naturally, the complexity of the 
structure demands different approaches. The steps in a typical macromolecular 
crystallographic study are the following. 

1. One attempts to prepare suitable crystals of the native macromolecule. (This 
is the most difficult and time-consuming stage in protein or nucleic acid 
crystallography. In many cases, macromolecular crystals form that appear 
beautiful morphologically, but they have so much disorder that high-resolu- 
tion diffraction data are unobtainable.) Using the crystals, one determines the 
space group and unit-cell dimensions, and then collects a set of scattering
amplitude data.

2. One attempts to prepare several different heavy-atom isomorphous derivatives.
These are crystals with the same unit cell, space group, and macromolecular 
structure as the parent crystal, except that one or more heavy atoms have been 
introduced at specific loci. For each derivative, one collects a new set of 
scattering amplitude data. 

3. One attempts to find the locations of the heavy atoms in the crystal. A popu- 
lar way to do this is the difference isomorphous Patterson synthesis we shall 
discuss. 

4. One attempts to refine the positions assigned to heavy atoms, either by use 
of difference Fourier refinement techniques somewhat like those described 
for small molecules, or by more elaborate methods. 

5. By comparing the structure factor data of the parent crystal with the corre- 
sponding data of one or more heavy-atom isomorphous derivatives, it is 
possible to estimate the phases of each F(h, k, l) of the parent crystal. In general,
the more heavy-atom derivatives available, the more accurate the phase 
estimates will be. 

6. By using the phases estimated for the parent crystal, it often is possible to 
refine the positions of the heavy atoms further using least-squares or difference
Fourier techniques. This procedure in turn leads to better estimates for the 
phases of the parent crystal. 

7. Using the estimated phases and observed amplitudes of each F(h, k, l), one
calculates an electron density map using Equation 13-93. 

8. A model is built of the electron density map. Usually at this stage, only
low-resolution data (typically 5.5 to 7 Å) have been used, and so the map does not
show well-resolved structural details. Then steps 4 through 8 are repeated
with data at higher resolution (2.5 to 3 Å) until it is possible to construct a
molecular model. 

9. Sometimes, one attempts to refine the structure. For example, one can calculate 
phases from the atom positions in the molecular model by Equation 13-70 
and use them instead of the phases determined in stages 5 and 6. The refinement 
can involve Fourier or least-squares techniques, and can treat just the x-ray 
data or can also include information about known energetics of protein 
conformation. 

Most of these stages are discussed in more detail in the following subsections.
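Step 5 in the list above can be sketched numerically. The code below, in Python, uses the standard cosine-law relation for F_PH = F_P + F_H, which this section does not spell out, so it should be read as an assumption about how the comparison is made. The structure-factor values and the function names are hypothetical, and measurement error is ignored.

```python
import cmath
import math

def phase_candidates(fp_amp, fph_amp, f_h):
    """Two possible parent phases from one derivative (cosine law on F_PH = F_P + F_H)."""
    fh_amp = abs(f_h)
    phi_h = cmath.phase(f_h)
    cos_delta = (fph_amp ** 2 - fp_amp ** 2 - fh_amp ** 2) / (2.0 * fp_amp * fh_amp)
    delta = math.acos(max(-1.0, min(1.0, cos_delta)))   # clamp against rounding
    return (phi_h + delta, phi_h - delta)

def angular_distance(a, b):
    """Smallest absolute difference between two angles, in radians."""
    return abs(cmath.phase(cmath.exp(1j * (a - b))))

def resolve_phase(candidates_1, candidates_2):
    """Pick the candidate pair that agrees best and return its circular mean."""
    best = min(
        ((a, b) for a in candidates_1 for b in candidates_2),
        key=lambda pair: angular_distance(pair[0], pair[1]),
    )
    return cmath.phase(cmath.exp(1j * best[0]) + cmath.exp(1j * best[1]))

# Hypothetical "true" values, used only to generate the measurable amplitudes.
F_P = cmath.rect(10.0, 1.0)     # parent: amplitude 10, phase 1.0 rad
F_H1 = cmath.rect(3.0, 0.2)     # heavy-atom contribution, derivative 1
F_H2 = cmath.rect(4.0, 2.5)     # heavy-atom contribution, derivative 2

fp_amp = abs(F_P)
cands_1 = phase_candidates(fp_amp, abs(F_P + F_H1), F_H1)
cands_2 = phase_candidates(fp_amp, abs(F_P + F_H2), F_H2)
phi_estimate = resolve_phase(cands_1, cands_2)
print(cands_1, cands_2, phi_estimate)
```

One derivative leaves two candidate phases; the second derivative singles out the common one, which is one way to see why several derivatives are needed. With real data the amplitudes carry errors, so the candidates agree only approximately.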



Preparation and properties of macromolecular crystals 

For crystallographic studies on a macromolecule of 50,000 mol wt, one needs crystals
about 0.3 mm in each dimension. To form these, one generally must prepare a supersaturated
solution of the macromolecule and control the rate at which crystal nucleation
and growth occur. Solubility can be altered by varying the pH, salt concentration,
types of salts present, and temperature of the solution — or by adding organic solvents. 

One convenient way to control the rate of change of many of these parameters 
is dialysis. Another is vapor diffusion. Here a droplet of macromolecular solution is 
allowed to equilibrate through the vapor phase with a reservoir of solution. If, for 
example, the reservoir has a higher salt concentration than the sample, the result 
will be a gradual removal of solvent from the sample. Further details about these and 
other techniques are given by T. L. Blundell and L. N. Johnson (1976). 

Protein and nucleic acid crystals differ from small-molecule crystals in one 
important aspect. They contain a considerable quantity (typically 50%) of liquid
solvent. Cases are known in which protein or nucleic acid crystals are more than 
two-thirds solvent by weight. A typical crystal has much less solvent than this, but it 
usually still resembles a two-phase system. A solid phase is composed of individual 
macromolecules that usually are touching each other in only a few places. In between 
is a series of open channels filled with solvent. Figure 13-32 shows an example. 

The large amount of solvent in crystals offers several advantages. It permits
small molecules to be diffused into the crystals. As we shall see, this facilitates the 
incorporation of heavy atoms. It allows substrates or ligands to be introduced into a 
preformed crystal and thus makes possible the study of the structure of macromole- 
cule-ligand complexes. Indeed, some enzymes actually are quite active in the crystal- 
line state. Finally, the large amount of solvent present makes it likely that the structure 
determined for the crystalline molecule will closely resemble its structure when free 
in solution. 

One disadvantage arises from the large solvent content. Some of the solvent 
quite close to the macromolecule is well-ordered. It contributes to the observed x-ray 
scattering, and it must be taken into account in solving the structure. On the other 
hand, once this is accomplished, it provides important clues to how macromolecules 
interact with solvent. 



Preparation of isomorphous heavy-atom derivatives 

The multiple isomorphous replacement technique has been used to solve almost all 
protein and nucleic acid structures known to date. It requires a set of three or more 
crystals of the sample: the parent crystal, and at least two other crystals identical in 
space group and molecular structure except for the presence of one or more heavy 
atoms. In general, the heavy atoms either can replace atoms normally present in the 
structure, or can be additions to the structure. We shall restrict our attention to the 
latter case because it is somewhat easier to treat mathematically, but ultimately both 
cases are fairly equivalent. 

One approach to preparing an isomorphous derivative would be to attach 
covalently a heavy metal to the macromolecule in solution, and then to subject it to 
crystallization conditions. In practice, this approach is not necessarily effective. The 
factors that promote formation of good crystals are so fickle that frequently even a
small chemical alteration of the structure will either block crystallization or lead to a
crystal that no longer is isomorphous. Therefore, almost always one starts with a
preformed crystal of nonmodified macromolecule. Reagents containing heavy atoms
then are allowed to diffuse into the crystal. This technique provides fair assurance that
crystal packing and molecular structure remain largely unaltered. Table 13-2 lists 
some of the types of reagents used, and Table 13-3 summarizes the results. 



Table 13-2

Representative heavy-atom labeling reagents

Reagent                                                           Binding sites
AgNO3                                                             SH groups
Xe                                                                Noncovalent
KI + I2                                                           Tyrosines
PCMB: Cl-Hg-C6H4-COO(-)                                           SH groups
Na2PtCl4                                                          Methionines, histidines, and others
Cl-Hg-C6H4-SO2F                                                   Active-site serines
Hg(Ac)2                                                           SH groups, histidines
UO2(NO3)2 or UO2(Ac)2                                             Carboxyls
Mersalyl: HO-Hg-CH2-CH(OCH3)-CH2-NH-CO-C6H4-O-CH2-COONa           Histidines, SH groups

Source: After D. Eisenberg, in The Enzymes, 3d ed., vol. 1, ed. P. D. Boyer (New York: Academic
Press, 1970).



Figure 13-32 

A section through a crystal of insulin. Each wedge-shaped unit is one monomer. These monomers
associate into dimers, which in turn aggregate into hexamers. The hexamers pack into the crystal. Note
the large solvent channels and the relatively few direct contacts between hexamers. All atoms except
hydrogen are shown. [From T. L. Blundell, D. C. Hodgkin, G. G. Dodson, and D. A. Mercola, Adv.
Protein Chem. 26:279 (1972).]

Table 13-3

Representative protein crystal structures determined with heavy-atom isomorphous derivatives

                                  Molecular  Number of  Space         Molecules per    Number of heavy
Protein                           weight     subunits   group         asymmetric unit  atoms used       Resolution
Metmyoglobin, sperm whale         17,800     1          [illegible]   1                8                1.4 Å
Oxyhemoglobin, horse              64,500     4          C2            1/2              7                2.8 Å
Ferricytochrome c, horse heart    12,400     1          [illegible]   1                [illegible]      2.8 Å
Carboxypeptidase A, beef          34,600     1          P2_1          1                5                2.0 Å
α-Chymotrypsin, beef              25,000     1          P2_1          2                6                2.0 Å
Papain                            23,000     1          P2_1 2_1 2_1  1                7                2.8 Å
Nuclease, S. aureus               16,800     1          P4_1          1                3                2.8 Å
Lactate dehydrogenase, dogfish    135,000    4          F422          1/4              5                2.0 Å
Lysozyme, hen egg                 14,600     1          P4_3 2_1 2    1                8                2.0 Å

Source: After D. Eisenberg, in The Enzymes, 3d ed., vol. 1, ed. P. D. Boyer (New York: Academic Press, 1970).



Once an isomorphous derivative is available, diffraction data are collected from 
it and compared with those from the unmodified crystal. The reciprocal lattice 
dimensions and symmetry should be unaltered, but the observed intensities of some 
of the reflections can change markedly (Fig. 13-33). These differences can make it 
possible to estimate the phases of the observed structure factors. However, it first is 
necessary to locate the heavy atoms; the next few subsections describe this process. 



Structure factors for heavy-atom isomorphous derivatives 

The electron density distribution of the heavy-atom isomorphous derivative is just
the sum of the electron densities of the parent crystal and of the heavy-atom substitutions.
Thus the structure factor F_{PH} of the heavy-atom isomorphous derivative
must be related to the structure factor F_P of the parent crystal and the structure factor
F_H of the heavy atoms alone simply by

F_{PH}(h, k, l) = F_P(h, k, l) + F_H(h, k, l)    (13-102)

because the additional scattering in the derivative is due simply to the presence of the 
heavy atoms. 

Note, however, that all three quantities in Equation 13-102 are complex numbers.
The significance of this equation can best be seen by expressing each number as a
vector in the complex plane as described in Figure 13-4. To add two complex numbers,
one simply adds the real and imaginary components separately. Therefore, represented
as vectors, two complex numbers combine just as vectors do, component by
component. The result is still a vector in the complex plane, as illustrated in Figure
13-34. One such vector equation holds for each value h, k, l of the structure factors.

Figure 13-33

Isomorphous replacement. Two precession photographs of triclinic lysozyme crystals are superimposed,
slightly out of horizontal register. The left spot of each pair is from a native lysozyme crystal; the right
spot is from a crystal after diffusion of HgBr2. This is a photograph of the (0, k, l) plane of the reciprocal
lattice. Note the differences in intensities. [From R. Dickerson, in The Proteins, 2d ed., vol. 2, ed. H.
Neurath (New York: Academic Press, 1964).]

If any two of the vectors shown in Figure 13-34 are known, the third can be
calculated unambiguously. The principal obstacle in macromolecular crystallography
is that only the lengths of two of the vectors, |F_P| and |F_{PH}|, are directly measurable
experimentally.
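
The vector relationship of Equation 13-102 can be sketched in a few lines of Python. This is only an illustration: the amplitudes and phases below are made-up numbers for a single reflection, and the helper name structure_factor is hypothetical.

```python
import cmath
import math

def structure_factor(amplitude, phase_deg):
    """Build a complex structure factor from an amplitude and a phase in degrees."""
    return cmath.rect(amplitude, math.radians(phase_deg))

# Illustrative (invented) values for one reflection h, k, l.
F_P = structure_factor(100.0, 40.0)   # parent crystal
F_H = structure_factor(30.0, 110.0)   # heavy atoms alone

# Equation 13-102: complex numbers add component by component.
F_PH = F_P + F_H

# If any two of the three vectors are known, the third follows unambiguously.
F_H_recovered = F_PH - F_P

# A diffraction experiment supplies only these two lengths, not the phases.
amp_P, amp_PH = abs(F_P), abs(F_PH)
print(round(amp_P, 2), round(amp_PH, 2))
```

The two printed amplitudes are all that the measurement provides; the phases used to build F_P and F_H are exactly the information that has to be recovered by the methods that follow.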



Location of heavy atoms by a difference Patterson map 

An ordinary Patterson map (Eqn. 13-95) cannot be used to locate the heavy atoms
in a macromolecular crystal. We showed earlier that the contrast between heavy
atom-heavy atom vectors and other vectors will be insufficient. However, when
both an isomorphous heavy-atom derivative and a parent crystal are available, it is
possible to calculate a difference isomorphous Patterson map between them, using
the measured structure factor amplitudes |F_{PH}(h, k, l)| and |F_P(h, k, l)|.

Figure 13-34

Structure factors plotted in the complex plane (real and imaginary axes),
for a parent-crystal diffraction spot, a heavy-atom
isomorphous addition, and the expected
derivative diffraction spot.

A true Patterson map of just heavy-atom vectors would be given by Equation
13-95 as

P_H(x, y, z) = \frac{1}{V} \sum_{h=-\infty}^{\infty} \sum_{k=-\infty}^{\infty} \sum_{l=-\infty}^{\infty} |F_H(h, k, l)|^2 e^{-2\pi i(hx + ky + lz)}    (13-103)

We cannot calculate this map directly because |F_H(h, k, l)| is not experimentally measurable.
However, it turns out that |F_H| often can be approximated fairly well by

|F_H| \approx \bigl| |F_{PH}| - |F_P| \bigr|    (13-104)

Thus we can calculate an estimate of |F_H| from the measured amplitudes of the parent crystal
and a heavy-atom isomorphous derivative. Then an isomorphous difference Patterson
function ΔP is calculated:

\Delta P(x, y, z) = \frac{1}{V} \sum_{h=-\infty}^{\infty} \sum_{k=-\infty}^{\infty} \sum_{l=-\infty}^{\infty} \bigl| |F_{PH}| - |F_P| \bigr|^2 e^{-2\pi i(hx + ky + lz)}    (13-105)

In an ideal case, it can be shown that this function will display the heavy atom-heavy 
atom vector at one-half the expected intensity plus some contaminating noise due to 
light atom-light atom vectors (see Blundell and Johnson, 1976). 

The accuracy and usefulness of Equation 13-105 depend on the validity of
Equation 13-104. This in turn depends on relative phases and amplitudes of the three
structure factors involved. We discuss a particular simplified case in the following
subsection. Here, note the following observation. When the two vectors F_{PH} and F_P
in Equation 13-102 are parallel, then the phases cancel, and Equation 13-104 is exact.
As long as the three vectors are near-parallel, Equation 13-104 should be an excellent
approximation. The largest values of ||F_{PH}| - |F_P|| will tend to be those that arise
when F_{PH} and F_P are parallel. Thus, conveniently, the largest terms that enter Equation
13-104 will be those most likely to contain good estimates of |F_H|. By selectively
including only the large terms in Equation 13-105, we often can produce an improved
heavy-atom difference Patterson map. From the heavy atom-heavy atom vectors, we
can attempt to find the actual heavy-atom locations by the methods described in
Box 13-6.
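
How the quality of the estimate in Equation 13-104 depends on the relative phases can be checked numerically. The sketch below uses invented values (a parent amplitude of 100, a heavy-atom amplitude of 20) and rotates the heavy-atom vector away from the parent vector; the function name estimate_F_H is hypothetical.

```python
import cmath
import math

def estimate_F_H(F_P, F_H):
    """Return (true |F_H|, estimate from Equation 13-104) for one reflection."""
    F_PH = F_P + F_H                        # Equation 13-102
    estimate = abs(abs(F_PH) - abs(F_P))    # Equation 13-104
    return abs(F_H), estimate

F_P = cmath.rect(100.0, math.radians(25.0))   # invented parent structure factor

results = {}
for offset_deg in (0, 30, 60, 90):
    # Heavy-atom vector of fixed length, rotated away from F_P by offset_deg.
    F_H = cmath.rect(20.0, math.radians(25.0 + offset_deg))
    results[offset_deg] = estimate_F_H(F_P, F_H)
    true_amp, est = results[offset_deg]
    print(f"offset {offset_deg:3d} deg: true |F_H| = {true_amp:.2f}, estimate = {est:.2f}")
```

With parallel vectors the estimate equals the true amplitude; as the angle grows the estimate falls below it, which is why the largest amplitude differences are the most trustworthy terms.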

If Equation 13-104 is to be useful, the presence of the heavy atom in the isomorphous
derivative must cause a change in scattering intensity sufficient to yield
measurable differences between |F_{PH}| and |F_P|. For example, a single mercury atom
with 80 electrons will produce an average change of 30% between |F_P| and |F_{PH}| in
a 40,000-dalton protein. Thus, it is more than sufficient for the calculation of a difference
Patterson map.

Using centrosymmetric projections to locate heavy atoms 

A crystal of biological material cannot contain a center of symmetry because the 
molecules contain asymmetric carbon atoms. However, it frequently is possible to 
calculate a centrosymmetric projection (Box 13-5). For example, if a structure has a 
twofold screw or rotation axis, projection onto a plane perpendicular to this axis 
will result in a center of symmetry. In this two-dimensional projection, the phases of 
the structure factor must be either 0 or π, so that all structure factors are either parallel
or antiparallel vectors. This greatly simplifies the use of Equation 13-102. 

In a centrosymmetric projection, the structure factors for the parent crystal,
heavy-atom isomorphous derivatives, and heavy atoms must be related by one of
the arrangements shown in Figure 13-35. So long as F_{PH} and F_P point in the same
direction, it is apparent that |F_H| = ||F_{PH}| - |F_P||, which allows |F_H| to be calculated
directly from experimental data. Only in those cases where |F_H| is much larger than
|F_P| does Equation 13-104 become incorrect. These cases, which are called crossovers,
are very rare and do not seriously compromise most heavy-atom isomorphous
difference Patterson projections.

Figure 13-36 shows an example of three difference Patterson projections ob- 
tained for heavy-atom derivatives of cytochrome c. These were obtained by using 
Equation 13-105. The two relatively simple maps (Fig. 13-36a,b) result from a Pt 
and a Hg derivative. The more complex map (Fig. 13-36c) was obtained from a crystal
into which both metals had been substituted. This map shows Hg-Pt vectors as 
well as Hg-Hg and Pt-Pt vectors. From these maps, estimates of both the Pt and Hg 
coordinates can be obtained (see Box 13-6). 
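
The bookkeeping behind the interpretation in Box 13-6 can be sketched as follows. The code assumes, as in the box, four heavy atoms related by a fourfold rotation axis at the origin of the projection; the starting coordinates (3/10, 1/10) are invented, and the function names are hypothetical. Vectors are not reduced modulo the unit cell, which a real Patterson map would require.

```python
from collections import Counter
from fractions import Fraction

def fourfold_positions(x, y):
    """Positions generated from (x, y) by a fourfold rotation axis at the origin."""
    return [(x, y), (-y, x), (-x, -y), (y, -x)]

def patterson_vectors(atoms):
    """Count every interatomic vector r_i - r_j, including self-vectors (i == j)."""
    weights = Counter()
    for xi, yi in atoms:
        for xj, yj in atoms:
            weights[(xi - xj, yi - yj)] += 1
    return weights

# Invented fractional coordinates for one heavy atom (exact arithmetic via Fraction).
x, y = Fraction(3, 10), Fraction(1, 10)
weights = patterson_vectors(fourfold_positions(x, y))

for vector, weight in sorted(weights.items()):
    print(tuple(float(c) for c in vector), weight)
```

The four self-vectors pile up at the origin, each side of the square of heavy atoms occurs twice (doubly weighted peaks), and each diagonal occurs once (singly weighted peaks).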

In some cases, the heavy-atom positions found from a difference Patterson map 
are used directly to determine preliminary protein phases, as shown in the next sub- 
section. In most cases, they must be refined first. (Refinement techniques are discussed 
in subsequent subsections.) 



Figure 13-35

Structure factors in a centrosymmetric projection. Shown as vectors are all of the possible
arrangements of parent (P), heavy atom alone (H), and isomorphous derivative (PH) structure
factors. A vector pointing from left to right is assigned a positive sign (zero phase angle).
[Diagram not reproduced. Its columns are headed: structure factors; signs of F_P and F_H;
measured amplitude change, |F_{PH}| - |F_P|.]



Using heavy-atom positions to estimate phases of the structure factor 

From the coordinates and identity of each known heavy atom, we can compute both
the phase and amplitude of its contribution to the structure factor, using Equation
13-70. This computation yields F_H. The structure factor of a heavy-atom isomorphous
derivative, F_{PH}, must be related to that of the parent crystal, F_P, and to F_H simply
by Equation 13-102.

Once the heavy atom is found, F_H is known completely. However, only the
amplitudes |F_{PH}| and |F_P| can be measured. Using all three quantities, it is possible
to restrict the phase of F_P to only two possibilities (Fig. 13-37a). The possible values
of F_P lie on a circle of radius |F_P| centered at the origin. Possible values for F_{PH} will
lie on a circle of radius |F_{PH}| but, to satisfy Equation 13-102, the center of this circle
must be displaced from the origin by the known vector F_H. Then the two circles intersect
at two points. At each intersection, corresponding to phases of φ_a and φ_b, the
conditions prescribed by Equation 13-102 are met.
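
The two-fold phase ambiguity can be reproduced numerically. The sketch below is one possible way to do it, not a method taken from the text: it applies the law of cosines to the triangle of Equation 13-102 instead of intersecting circles graphically. The function name candidate_phases and all numerical values are assumptions.

```python
import cmath
import math

def candidate_phases(amp_P, amp_PH, F_H):
    """Return the two phases (radians) of F_P allowed by |F_P|, |F_PH| and a known F_H.

    From |F_P + F_H|^2 = |F_PH|^2 (Equation 13-102), the law of cosines gives
    cos(phi_P - phi_H) = (|F_PH|^2 - |F_P|^2 - |F_H|^2) / (2 |F_P| |F_H|).
    """
    amp_H, phi_H = abs(F_H), cmath.phase(F_H)
    cos_delta = (amp_PH**2 - amp_P**2 - amp_H**2) / (2.0 * amp_P * amp_H)
    cos_delta = max(-1.0, min(1.0, cos_delta))   # tolerate small measurement errors
    delta = math.acos(cos_delta)
    return phi_H + delta, phi_H - delta

# Synthetic test case: the "true" parent phase is 40 degrees.
true_F_P = cmath.rect(100.0, math.radians(40.0))
F_H = cmath.rect(30.0, math.radians(110.0))
amp_P, amp_PH = abs(true_F_P), abs(true_F_P + F_H)

phases = candidate_phases(amp_P, amp_PH, F_H)
print([round(math.degrees(p), 3) for p in phases])
```

Both returned phases reproduce the measured |F_{PH}| equally well; only one of them is the phase that was used to generate the data, and nothing in a single derivative tells them apart.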

Figure 13-36

Difference Patterson maps, calculated for projections of crystals of cytochrome c into the x-y plane.
The origin is at the upper left of each map. All maps are drawn to the same scale; contour intervals are
marked at the lower right-hand corners; the height of the peak at the origin is indicated at the upper
left-hand corners. The zero contours are dashed. The x and y coordinates are indicated; they run only
from the origin to one-half the unit-cell dimensions. Single-weight Patterson peaks are shown by X,
double-weight Patterson peaks by *. (a) A platinum derivative. (b) A mercury derivative. (c) A
derivative containing both heavy metals; platinum-mercury cross-vectors are shown by #. [From
R. E. Dickerson et al., J. Mol. Biol. 29:77 (1967).]

Box 13-6 AN EXAMPLE OF THE INTERPRETATION OF A DIFFERENCE PATTERSON PROJECTION

The difference Patterson map shown in Figure 13-36a was calculated for a projection into a
plane perpendicular to the c axis of a tetragonal crystal of cytochrome c (a = b = 58.45 Å;
c = 42.34 Å):

\Delta P(x, y) \propto \sum_{h=-\infty}^{\infty} \sum_{k=-\infty}^{\infty} \bigl| |F_{PH}(h, k, 0)| - |F_P(h, k, 0)| \bigr|^2 e^{-2\pi i(hx + ky)}

where |F_{PH}| and |F_P| are the square roots of the measured intensities of the heavy-atom
isomorphous Pt derivative and the parent crystal, respectively. Thus, ΔP(x, y) can be calculated
from data collected for a single layer of the reciprocal lattice.

The space group of these crystals is P4_1. The asymmetric unit is one molecule of cytochrome c.
There are four molecules per unit cell, and these are related by a fourfold screw
axis. The projection of the structure perpendicular to the c axis places all four molecules in
the a-b plane, where they are now related by a fourfold rotation axis. Because of this axis,
the two-dimensional structure also has a center of symmetry.

We can use the symmetry to predict what the difference Patterson map should look like
for a single heavy atom located at identical positions on each of the four molecules. For
convenience, we choose the origin of the coordinate system right at the fourfold axis. Then,
if the position of one heavy atom is xa + yb, the others must be located as shown in part a
of the figure. The corresponding heavy atom-heavy atom vectors will be ±(x + y, y - x) and
±(x - y, x + y), each occurring twice, and ±(2x, 2y) and ±(2y, -2x), each occurring once,
plus four heavy-atom self-vectors, which will lie at the origin.

[Figure: (a) the four symmetry-related heavy-atom positions, numbered 1 to 4, with atom 1 at
(x, y) and atom 3 at (-x, -y); (b) the corresponding difference Patterson map.]

The resulting difference Patterson map will be that shown in part b of the figure (for the
relative values of x and y shown in part a of the figure), where the number adjacent to each
peak gives its relative weight. Notice that the map has the same fourfold rotational symmetry
as the structure that generated it. Concentrate just on the quadrant at lower right, and compare
the result with Figure 13-36a. Notice the doubly-weighted peak near the vertical axis.
This peak must correspond to the nearly vertical vector that forms two sides of the square of
heavy atoms in the structure. The singly-weighted peak (near the lower right of Fig. 13-36a)
is produced from a diagonal of the square. The hint of a peak near the horizontal axis arises
from the nearby doubly-weighted vector in the upper right-hand quadrant of the map. Thus,
a square structure of heavy atoms is fully consistent with the observed difference Patterson
map. Once the vectors have been assigned, their locations yield the values of x and y, and
thus the actual heavy-atom positions.

It would be a useful exercise for the reader to interpret the Hg difference map in Figure
13-36b and then, using the results of both maps, to attempt to explain the results shown in
Figure 13-36c for the Hg, Pt double derivative.

Figure 13-37

Phase determination by isomorphous replacement. Structure factors are plotted in the complex plane, as in
Figure 13-34. (a) A single heavy-atom derivative. The circle with radius |F_P| represents the parent
crystal, with measured intensity and unknown phase. The circle with radius |F_{PH}| represents the
isomorphous heavy-atom-containing crystal, with measured intensity and unknown phase. The vector
F_H is calculated from the heavy-atom position that had been determined from a difference Patterson
synthesis. Because F_H is calculated, both its phase and its amplitude are known. Equation 13-102 will
be satisfied by F_P, F_H, and F_{PH} when F_P lies at the origin, but F_{PH} is located at the end of F_H as shown.
Therefore, the two circles are displaced, and they intersect at two positions: A and B. These positions
define two possible values for the phase of F_P: φ_A and φ_B. (b) Inclusion of a second heavy-atom
derivative. Its scattering amplitude yields a circle (colored) of radius |F_{PH'}| centered at the end of the
vector F_{H'}, which is calculated from the known position of the heavy atom. This circle also intersects
the F_P circle at two places: B and C. Because one intersection (B) is the same as an intersection found with
the first heavy atom, the only phase choice for F_P consistent with both derivatives is φ_B. [After D.
Eisenberg, in The Enzymes, 3d ed., vol. 1, ed. P. D. Boyer (New York: Academic Press, 1970), p. 1.]

The most common resolution of the remaining uncertainty in phase is to use a
second isomorphous heavy-atom derivative. One estimates the position of the heavy
atoms, computes F_{H'}, then uses the analog of Equation 13-102: F_{PH'} = F_P + F_{H'}. The
process of selecting a phase for F_P is repeated by comparing |F_{PH'}| and |F_P|,
using the known F_{H'} (Fig. 13-37b). In an ideal case, one of the two circle intersections
will correspond either to φ_a or to φ_b, and the other will be some different value φ_c.
Thus, because F_P can have only one phase, it must uniquely be the angle derived in
common from the two isomorphous derivatives.

In real life, the experimental data are not perfect; nor can the heavy atoms be
located precisely. Therefore, the points of intersection of circles drawn from two
different isomorphous derivatives may not coincide exactly. Then, to resolve ambiguities,
it usually is desirable to have additional derivatives to strengthen the accuracy
of phase assignment and to guard against apparent agreement that is accidental.
Naturally, the more derivatives available, the more accurately the phase angles are
likely to be chosen. Statistical procedures for choosing the best phase estimates from
multiple isomorphous derivatives are described by Blundell and Johnson (1976).
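
A naive version of the two-derivative phase choice is sketched below. It is not one of the statistical procedures cited above: it simply takes the pair of candidate phases, one from each derivative, that agree most closely, and averages them. All function names and numbers are assumptions, and the synthetic data are error-free.

```python
import cmath
import math

def candidate_phases(amp_P, amp_PH, F_H):
    """Two phases of F_P consistent with |F_P|, |F_PH| and a known complex F_H."""
    amp_H, phi_H = abs(F_H), cmath.phase(F_H)
    cos_delta = (amp_PH**2 - amp_P**2 - amp_H**2) / (2.0 * amp_P * amp_H)
    cos_delta = max(-1.0, min(1.0, cos_delta))   # tolerate small measurement errors
    delta = math.acos(cos_delta)
    return phi_H + delta, phi_H - delta

def angular_difference(a, b):
    """Smallest absolute difference between two angles, in radians."""
    return abs(cmath.phase(cmath.rect(1.0, a - b)))

def consensus_phase(amp_P, first, second):
    """Pick the best-agreeing candidates from two derivatives, each (amp_PH, F_H)."""
    pairs = [(p, q)
             for p in candidate_phases(amp_P, *first)
             for q in candidate_phases(amp_P, *second)]
    p, q = min(pairs, key=lambda pair: angular_difference(*pair))
    # Average the two nearly coincident directions as unit vectors.
    return cmath.phase(cmath.rect(1.0, p) + cmath.rect(1.0, q))

# Synthetic example: true parent phase of 40 degrees, two invented derivatives.
true_F_P = cmath.rect(100.0, math.radians(40.0))
F_H1 = cmath.rect(30.0, math.radians(110.0))
F_H2 = cmath.rect(25.0, math.radians(-20.0))
first = (abs(true_F_P + F_H1), F_H1)
second = (abs(true_F_P + F_H2), F_H2)

phi_P = consensus_phase(abs(true_F_P), first, second)
print(round(math.degrees(phi_P), 3))
```

With imperfect data the best pair no longer coincides exactly, and an accidental near-agreement between two wrong candidates becomes possible. That is the reason given above for collecting further derivatives.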

Once estimates of the phases of F_P are available, one can use Equation 13-93 to
calculate an electron density map of the macromolecule by inserting the measured
amplitudes |F_P(h, k, l)| and calculated phases φ_hkl. However, in most cases, this map
will not be very accurate unless the estimates of the heavy-atom positions are first
refined.

Phase estimates with a center of symmetry

Suppose one can prepare only a single isomorphous derivative. The prognosis is
not completely hopeless. In many cases, a projection of the crystal onto a plane will
have a center of symmetry. The advantages of centrosymmetric projections in calculating
the amplitude of F_H were described earlier. Here we show how such projections
also assist the calculation of the phase of F_P. Note that only a single layer of
reciprocal space is needed to compute the projections, so the structure problem now
is purely two-dimensional.

Whenever a center of symmetry exists, the resulting phases can be only 0 or π,
and so the only uncertainty remaining for F_P is the sign. The measured change in
amplitude of one diffraction spot due to a heavy-atom isomorphous substitution is

\Delta F = |F_{PH}| - |F_P|    (13-106)

All the possible arrangements of F_{PH}, F_P, and F_H are shown in Figure 13-35. When
the sign of ΔF (measured) is compared with the sign of F_H (calculated from the
known heavy-atom positions), an interesting generalization emerges. Except for two
of the rare crossover cases, whenever the sign of ΔF is the same as the sign of F_H,
the sign of F_P must be positive (φ = 0). Whenever ΔF and F_H have opposite signs,
F_P is negative (φ = π).
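
The sign rule can be checked by brute force. In a centrosymmetric projection every structure factor is a signed real number, so the sketch below enumerates a few invented combinations of F_P and F_H and records where the rule gives the wrong sign. The function names are hypothetical.

```python
def sign(value):
    """Return +1 for non-negative numbers and -1 for negative ones."""
    return 1 if value >= 0 else -1

def predicted_parent_sign(amp_P, amp_PH, F_H):
    """Sign of F_P inferred from Delta F (Equation 13-106) and the calculated F_H."""
    delta_F = amp_PH - amp_P
    return 1 if sign(delta_F) == sign(F_H) else -1

failures = []
for F_P in (50.0, -50.0):                                # invented parent values
    for F_H in (10.0, -10.0, 70.0, -70.0, 120.0, -120.0):
        F_PH = F_P + F_H                                 # Equation 13-102 with real numbers
        guess = predicted_parent_sign(abs(F_P), abs(F_PH), F_H)
        if guess != sign(F_P):
            failures.append((F_P, F_H))

print(failures)
```

For these numbers the rule fails only when F_H is opposite in sign to F_P and more than twice as large, so that the derivative amplitude exceeds the parent amplitude after the sign has flipped. These are the two exceptional crossover arrangements.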

Thus, even with only a single isomorphous derivative, nearly all of the phases
of a centrosymmetric projection of a structure can be computed correctly. Then the
electron density of the projection, ρ(x, y), can be calculated by a Fourier synthesis
exactly analogous to Equation 13-93. X-ray crystallographers frequently use projections
because they can be calculated at earlier stages in the analysis, and because
less computer time is required to do two-dimensional sums than to do three-dimensional
ones. However, bear in mind that a projection does not uniquely define the
three-dimensional structure that produced it.
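
A one-dimensional toy version of such a Fourier synthesis is sketched below. It assumes point atoms, a unit cell of length 1, a centrosymmetric arrangement (two equal atoms at x = 0.25 and x = 0.75), and the sign conventions F(h) = Σ f_j exp(2πi h x_j) and ρ(x) = Σ F(h) exp(-2πi h x). The constant 1/V prefactor is omitted, and all names and values are invented.

```python
import cmath
import math

def structure_factors(atoms, h_max):
    """F(h) = sum_j f_j exp(2 pi i h x_j) for point atoms (f_j, x_j) in a 1-D cell."""
    return {h: sum(f * cmath.exp(2j * math.pi * h * x) for f, x in atoms)
            for h in range(-h_max, h_max + 1)}

def fourier_synthesis(amplitudes, signs, n_grid):
    """rho(x) = sum_h sign(h) |F(h)| exp(-2 pi i h x), sampled at n_grid points."""
    density = []
    for i in range(n_grid):
        x = i / n_grid
        total = sum(signs[h] * amplitudes[h] * cmath.exp(-2j * math.pi * h * x)
                    for h in amplitudes)
        density.append(total.real)
    return density

# Centrosymmetric toy structure: x = 0.75 is equivalent to x = -0.25.
atoms = [(8.0, 0.25), (8.0, 0.75)]
F = structure_factors(atoms, h_max=10)

# With a center of symmetry each phase is 0 or pi, i.e. just a sign.
amplitudes = {h: abs(value) for h, value in F.items()}
signs = {h: (1 if value.real >= 0 else -1) for h, value in F.items()}

density = fourier_synthesis(amplitudes, signs, n_grid=100)
peak_index = max(range(len(density)), key=density.__getitem__)
print(peak_index / 100)
```

Amplitudes plus correct signs are enough to bring the density peaks back at the atomic positions; flipping a few of the signs in the dictionary is an easy way to see how sensitive the map is to phase errors.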

Narrowing heavy-atom positions with parent-crystal phase estimates 

If we knew the phases of each of the diffraction spots of a parent crystal and of a 
heavy-atom isomorphous derivative, we could calculate an electron density map of 
each by using Equation 13-39. However, sometimes it is useful to display just the 
locations of the heavy atoms. This can be done using a difference Fourier synthesis: 

\Delta\rho(x, y, z) = \rho_{PH} - \rho_P
    = \frac{1}{V} \sum_{h=-\infty}^{\infty} \sum_{k=-\infty}^{\infty} \sum_{l=-\infty}^{\infty} \bigl[ |F_{PH}(h, k, l)| e^{i\phi_P(h, k, l)} - |F_P(h, k, l)| e^{i\phi_P(h, k, l)} \bigr] e^{-2\pi i(hx + ky + lz)}    (13-107)

Here each structure factor has been explicitly shown as a phase plus an amplitude.
The amplitudes, |F_{PH}| and |F_P|, are measured. The phases of the parent crystal, φ_P,
are estimated as shown in the preceding sections and are used for both amplitudes.

In principle, we could in similar fashion estimate the phases of the derivative,
φ_PH, and use these in Equation 13-107; however, this would lead to problems.
Fourier syntheses are dominated by phases, not by amplitudes (Fig. 13-28). Calculated
phases of the derivative would contain a heavily weighted contribution from
the heavy atoms. The resulting Fourier synthesis would simply give back the same
heavy-atom positions one started with, and nothing would have been accomplished.
Even the estimates of the parent-crystal phases, φ_P, are heavily contaminated with
the heavy-atom phases.

In practice, when several heavy-atom derivatives are available, it is best to use 
parent phases estimated from one or more derivatives to compute the difference 
Fourier to find other derivatives; these are called cross-phase difference Fouriers. 
Figure 13-38 shows an example for the same cytochrome c derivatives discussed 
earlier. The Pt and Hg atoms show up clearly above a weak background. However, 
their apparent positions are not yet the true positions in the crystal. 

A least-squares refinement of a structural model 

There are adjustable parameters used in the calculation of an x-ray scattering pattern. 
In a least-squares refinement, one attempts to find the values for these parameters 
that minimize the difference between observed structure factor amplitudes and those 
calculated from any particular model or technique. We first illustrate this in the 
general case, and then show the specific application to isomorphous heavy atoms. 

The experimental data are measured structure factor amplitudes |F_o|. The
calculated |F_c| values usually come from Equation 13-70. They are a function of all
the structural parameters of the tentative model. These are the x, y, and z coordinates
of each atom, and the atomic number of each atom.

In addition, for high-resolution structures, there is another effect we must worry
about. Atoms are not fixed in space, even in a crystal. They are vibrating, and the
amplitudes will vary for each atom. The x-ray scattering will be an average over the
positions of each atom. It can be shown that, for isotropic thermal motion, the atomic
scattering factor will have the form f = f_0 e^{-β|S|^2/4}, where S is the scattering vector,
and β is related to the mean-square amplitude of the atomic vibration, ⟨μ^2⟩, by
β = 8π^2⟨μ^2⟩. This relation introduces another parameter. The thermal factor β can
be guessed from knowledge of the atom type, but in the most rigorous structure
determination it too will be a variable. Furthermore, real vibrations are anisotropic
and thus must be represented by a thermal ellipsoid defined by six parameters. So a
total of anywhere from three to nine parameters, P_j, are needed for each atom; a
very large number (n) of parameters are needed for the entire asymmetric unit.

Figure 13-38

Cross-phase difference Fourier maps, calculated for the same crystals of cytochrome c as those illustrated
in Figure 13-36. All three maps are on the same scale; contours are indicated at lower right; heights of
major peaks are indicated at lower left. The origin is at upper left; only one-half a unit cell in each
direction is shown. (a) Difference map calculated with Pt amplitudes and protein phases determined
from Hg derivative. A true Pt site is indicated by X, a questionable site by +, and a false site by Δ.
(b) Difference map calculated with Hg amplitudes and protein phases calculated for Pt derivatives. The
mercury site is at lower left. (c) Map for double derivative, using average protein phases from several
sets of metal derivatives. [From R. E. Dickerson et al., J. Mol. Biol. 29:77 (1967).]

In a least-squares refinement, one wants to adjust these parameters to minimize
the difference between observed and calculated structure factor amplitudes. In
practice, the actual quantity minimized is

D = \sum_{h=-\infty}^{\infty} \sum_{k=-\infty}^{\infty} \sum_{l=-\infty}^{\infty} W_{hkl} \bigl[ |F_o(h, k, l)| - |K F_c(h, k, l)| \bigr]^2    (13-108)

where W_{hkl} is a weighting factor measuring one's estimate of the reliability of a given
experimental or calculated point, and K is a scaling parameter. For each parameter
P_j one establishes the condition ∂D/∂P_j = 0. This leads to n equations in the n
unknown parameters. Solving these equations simultaneously produces the least-squares
fit.

For an example of how such a calculation is set up in matrix form, see Section 8-1.
The important thing is that an n x n matrix must be inverted. For linear equations,
a single matrix inversion suffices. However, the equations that result from differentiating
Equation 13-108 are not linear in the unknown parameters. Therefore, an
iterative technique must be used. This involves inverting an n x n matrix, taking
the resulting parameters, reinserting them into the equations, and repeating the
matrix-inversion process. This routine is performed over and over again until the
parameters converge on values that minimize D.
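
The iterative cycle can be illustrated with the smallest possible case: refining a single parameter, the isotropic thermal factor β in the attenuation exp(-β|S|^2/4). This is a toy Gauss-Newton sketch under stated assumptions (synthetic error-free data, unit weights, no scale factor, invented values), not the procedure of any particular refinement program. With one parameter the n x n matrix reduces to a single number.

```python
import math

def model(beta, s):
    """Isotropic thermal attenuation exp(-beta * s^2 / 4) for scattering-vector length s."""
    return math.exp(-beta * s * s / 4.0)

def refine_beta(s_values, observed, beta_start, tolerance=1e-10, max_cycles=50):
    """Iterative least-squares (Gauss-Newton) refinement of one parameter."""
    beta = beta_start
    for cycle in range(1, max_cycles + 1):
        # Residuals and derivatives of the model with respect to beta.
        residuals = [obs - model(beta, s) for s, obs in zip(s_values, observed)]
        derivs = [-(s * s / 4.0) * model(beta, s) for s in s_values]
        # Normal equations; with a single parameter the matrix is 1 x 1.
        normal = sum(d * d for d in derivs)
        gradient = sum(d * r for d, r in zip(derivs, residuals))
        shift = gradient / normal          # "inverting" the 1 x 1 matrix
        beta += shift                      # reinsert and repeat
        if abs(shift) < tolerance:
            break
    return beta, cycle

# Synthetic "observations" generated with beta = 20 (units of square angstroms).
s_values = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
observed = [model(20.0, s) for s in s_values]

beta_fit, cycles_used = refine_beta(s_values, observed, beta_start=5.0)
print(round(beta_fit, 6), cycles_used)
```

Because the model is nonlinear in β, a single solution of the normal equations does not land on the answer; the shift shrinks from cycle to cycle until the parameter stops changing. A real refinement does the same with thousands of parameters and a correspondingly large matrix.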



Least-squares refinement of heavy-atom positions 

In the isomorphous replacement technique, |F_P| and |F_{PH}| have been measured, F_H
has been calculated from an estimate of the heavy-atom positions, and φ_P has been
calculated as described earlier. If all these results were correct, then F_{PH}, F_P, and
F_H would form a triangle as shown in Figure 13-34. However, because of errors, the
triangle usually is not closed.

We can calculate the structure factor expected for the heavy-atom derivatives as

F_{PH,calc} = |F_H| e^{i\phi_H} + |F_P| e^{i\phi_P}    (13-109)

To improve the location of the heavy atoms, one attempts to minimize the difference
between the amplitude of this calculated structure factor and the observed amplitudes.
The equation used, by analogy to Equation 13-108, is

D = \sum_{h=-\infty}^{\infty} \sum_{k=-\infty}^{\infty} \sum_{l=-\infty}^{\infty} W_{hkl} \bigl[ |F_{PH}(h, k, l)| - |F_{PH,calc}(h, k, l)| \bigr]^2    (13-110)

This minimization is done by allowing the heavy-atom positions and the thermal 
parameters to vary. 

Such an approach is not necessarily the best one for every crystal, and alternative
approaches are discussed by Blundell and Johnson (1976).

Once the heavy-atom positions have been refined, they are used to calculate
a final set of phases for the parent crystal. Then, at last, the stage is set for a Fourier
synthesis of the entire structure using Equation 13-93.

Anomalous dispersion of heavy atoms 

One additional technique for exploiting the presence of heavy atoms has seen in- 
creasing use in protein and nucleic acid crystallography. This technique is anomalous 
dispersion. It occurs when the frequency of the x rays used falls near an absorption 
frequency of an atom. In practice, the technique is most useful for atoms heavier 
than sulfur. 

Until now we have treated the atomic scattering factor f as a real number. In
actuality, f is a complex number because the phase shift upon scattering is not necessarily
an integral or half-integral number of oscillations:

f(S) = f'(S) + i f''(S)    (13-111)

The term f'' is significant only when the x-ray frequency is close to an atomic absorption
frequency. It is related to the extinction coefficient of that particular atomic
absorption.

When f(S) was considered to be real, one of the implications was Friedel's law.
From Equation 13-18b, for parent-crystal atoms unaffected by anomalous dispersion,
we can write

|F_P(h, k, l)| = |F_P(-h, -k, -l)|    (13-112)

However, for crystals containing heavy atoms, this relationship no longer holds. 
The breakdown of Friedel's law can be used in a number of different ways (see Blundell 
and Johnson, 1976). For example, suppose two different x-ray frequencies are used, 
one allowing anomalous dispersion and one not. The difference in scattered intensities 
should represent the anomalous scattering, and this is restricted to the heavy atom. 
Then an analog of the isomorphous methods described above can allow calculation 
of phases. 
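
The breakdown of Friedel's law is easy to demonstrate with a one-dimensional toy structure. In the sketch below the atomic positions and scattering factors are invented, and the heavy atom is given an arbitrary imaginary component of 10 electrons purely for illustration.

```python
import cmath
import math

def F(h, atoms):
    """1-D structure factor: sum of f_j * exp(2 pi i h x_j); f_j may be complex."""
    return sum(f * cmath.exp(2j * math.pi * h * x) for f, x in atoms)

# Light atoms with real scattering factors (non-centrosymmetric arrangement).
light_only = [(6.0, 0.10), (7.0, 0.35), (8.0, 0.62)]
# Same structure plus a heavy atom whose scattering factor has an imaginary part.
with_anomalous = light_only + [(80.0 + 10.0j, 0.80)]

friedel_difference = {}
for label, atoms in (("real f", light_only), ("complex f", with_anomalous)):
    # Compare the amplitudes of the reflections h = 3 and h = -3.
    friedel_difference[label] = abs(F(3, atoms)) - abs(F(-3, atoms))
    print(label, round(friedel_difference[label], 4))
```

With purely real scattering factors the two amplitudes agree to rounding error, because F(-h) is the complex conjugate of F(h). The imaginary component on the heavy atom destroys that relationship, and the resulting amplitude difference is the measurable anomalous signal.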



interpretation of the electron density map 

Here we describe some typical stages in the solution of the crystal structure of a 
protein (see Fig. 13-39). Several heavy-atom derivatives have been prepared and 
located. Isomorphous replacement has been used to estimate phases for all F P (/i, /c, /), 
and an electron density map has been calculated using these phases and all data 
to a certain resolution. The resolution chosen will be a function of the order of the 
actual crystal and of how isomorphous the derivatives are. Reliable phases are needed 




Figure 13-39

Protein electron density maps as a function of resolution. The maps are calculated from measured
intensities and estimated phases. The protein is a diisopropyl fluorophosphate derivative of bovine
trypsin. The view is down the y axis of the active site. A ball-and-stick model of the final best estimate
of the structure is repeated in each map; note the phosphate at lower right and the active-site histidine
at lower center; above these two features is a disulfide bond. (a) A map at 6.0 Å resolution, contoured
from 0.05 e Å⁻³ in steps of 0.05 e Å⁻³. (b) A map at 4.5 Å resolution, contoured from 0.10 e Å⁻³ in
steps of 0.10 e Å⁻³. (c) A map at 3.0 Å resolution, contoured from 0.35 e Å⁻³ in steps of 0.30 e Å⁻³.
(d) A map at 1.5 Å resolution, contoured from 0.50 e Å⁻³ in steps of 0.50 e Å⁻³. All maps are
shown as stereo pairs. [Courtesy of John L. Chambers. For further details, see his unpublished
Ph.D. thesis, Calif. Institute of Technology, 1977.]
to justify the vastly increasing effort of using more and more scattering data in an
attempt to obtain higher resolution. 

If the result is around a 6 Å map, the macromolecule usually appears as a blob
of electron density; Figure 13-39a shows a typical example. At this resolution, it
usually is impossible to recognize the polymer chain backbone for a protein or
nucleic acid.

Even at 6 Å resolution, however, considerable useful information emerges. One
can learn a fairly detailed shape, and can spot crevices or subunits; α helices will
appear as rods. If heavy-atom-labeled ligands or substrates are available, difference
Fouriers can allow determination of the locations of their binding sites. If these 
results seem reasonable, it usually is worthwhile to attempt to proceed to higher- 
resolution analysis, providing that the quality of the data on available isomorphous 
derivatives justifies this. 

At 3.0 Å resolution, it is possible to trace the path of the polymer chain backbone
(Fig. 13-39c). In a protein, only the large amino acid side chains show up as discrete 
density peaks. It would be difficult to construct a meaningful molecular model at 
this stage. In nucleic acids, double helices will show up readily. 

At 2.5 Å resolution, almost all protein side chains are visible. The carbonyl
group of each peptide shows up as a protrusion from the main chain, and so it is 
possible to fix the orientation of each peptide plane. 

If the amino acid sequence is known, one can begin to construct a model of 
the protein. Various techniques exist for doing this. The simplest is a Richards box, 
which uses a half-silvered mirror to superimpose a wire model of the structure onto 
a pile of lucite sections where the electron density map is plotted. Coordinates for 
atoms then are read off the model. Newer methods use computer searches to trace 
the most likely continuous paths of electron density and fit a peptide chain to these. 
Note that both approaches have the built-in assumption that the geometry of the 
peptide chain (except for dihedral angles) is known. This is a far cry from high- 
resolution small-molecule x-ray crystallography, where one determines bond lengths 
and angles de novo. 

The original fit of the peptide chain to the electron density map at 2.5 to 3.0 Å
will not be very precise. Many groups cannot be centered on the electron density
peaks that presumably represent them. (See Fig. 13-39c for a typical example.) However,
the preliminary model now can be used for one or more cycles of refinement.
For example, in real-space refinement, one adjusts the model to try to minimize the
difference between $\rho(x, y, z)$ calculated from the x-ray data and $\rho(x, y, z)$ calculated
from the model. Alternatively, one can use Fourier refinement or least-squares 
techniques. 

At higher resolution, individual atoms begin to be seen (see Fig. 13-39d for an
example of a 1.5 Å map). Here it is actually possible to identify many amino acid
side chains directly from the electron density map. In fact, crystal-structure work has 
revealed a number of serious errors in predetermined amino acid sequences. The 
more side chains one can see, the more accurate a model one can build and, in turn, 
the more likely it is that a further improvement in the electron density map can result 
from additional refinement. 
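The effect of resolution on what can be seen in a map can be sketched in one dimension. The example below is illustrative only and rests on several assumptions not in the text: a hypothetical 30 Å one-dimensional cell, two unit point atoms 1.5 Å apart, exact phases, and a resolution cutoff d_min that keeps only reflections with |h| ≤ a/d_min. At 6 Å the two atoms merge into a single blob with its maximum between them; at 1.5 Å they appear as separate peaks.

```python
import numpy as np

def truncated_density(x, atom_positions, cell_length, d_min):
    """Fourier synthesis of unit point atoms using only reflections with spacing >= d_min."""
    h_max = int(np.floor(cell_length / d_min + 1e-9))
    h = np.arange(-h_max, h_max + 1)
    # Structure factors F(h) = sum_j exp(2*pi*i*h*x_j/a) for unit scatterers.
    F = np.exp(2j * np.pi * np.outer(h, atom_positions) / cell_length).sum(axis=1)
    # rho(x) = (1/a) * sum_h F(h) * exp(-2*pi*i*h*x/a)
    rho = (F[:, None] * np.exp(-2j * np.pi * np.outer(h, x) / cell_length)).sum(axis=0)
    return rho.real / cell_length

cell_length = 30.0                    # hypothetical cell edge in angstroms
atoms = np.array([10.0, 11.5])        # two atoms 1.5 angstroms apart
x = np.array([10.0, 10.75, 11.5])     # first atom, midpoint, second atom

low = truncated_density(x, atoms, cell_length, d_min=6.0)
high = truncated_density(x, atoms, cell_length, d_min=1.5)

print("6.0 A map  (atom, midpoint, atom):", low)
print("1.5 A map  (atom, midpoint, atom):", high)
```

In a real structure determination the phases are themselves uncertain, so actual maps are noisier than this idealized truncation suggests.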

Energetics of protein conformations in interpretation
of the electron density map 

A question of serious concern among crystallographers is how much knowledge 
about conformational analysis of proteins should be incorporated into the process 
of solving crystal structures. As shown in Chapter 5, we know with fair likelihood 
what ranges of dihedral angles are preferred by peptides. We know much about the 
forces that govern the interactions of nonbonded residues. Given a trial structure 
determined from a Fourier synthesis, this conformational knowledge could, by an 
energy minimization, be used to compute a structure more consistent with the body 
of acquired thermodynamic information. 

M. Levitt and R. Diamond have shown that alternate cycles of Fourier and 
conformational energy refinement can be synergistic and can lead to convergence 
to a better structure. This is reasonable. The fact that a crystal forms implies that 
it must be in a crystal-wide free energy minimum. Alternatively, one can use con- 
formational energies, not to try to obtain a minimal energy structure, but just as a 
guide on how to shift atoms slightly in the trial structure. A shift that simultaneously 
improves the fit of that atom to the calculated electron density map and also lowers 
the conformational energy is likely to be a step in the right direction. 

Difference Fourier syntheses in studying 
ligand-macromolecule interactions 

Difference Fouriers are extremely useful for comparing two structures. For example, 
the difference Fourier map of an isomorphous derivative and the parent should 
contain just the electron density of the added heavy atoms, as discussed earlier. 
Similarly, difference Fouriers have been used to locate substrate-binding or ligand- 
binding sites on proteins. Here the idea is to measure the diffraction intensities of 
the liganded protein. Then one calculates a difference Fourier using these measured 
intensities and calculated phases of the unliganded protein. The result can be just 
the electron density of the bound ligand, plus any difference in density due to changes 
in structure induced by the ligand binding. Naturally, the technique will work only 
so long as these differences are not too large. 

The difference Fourier technique for locating a bound ligand works well because 
the presence of the ligand changes the phase of most structure factors relatively little. 
We can write (by analogy to Eqn. 13-102) 

$\mathbf{F}_{PL} = \mathbf{F}_P + \mathbf{F}_L$ (13-113)

where PL stands for the parent-ligand complex, and $\mathbf{F}_L$ is the structure factor of
the ligand. Figure 13-40 shows this vector equation graphically. Here, $\mathbf{F}_P$ is known,
and $|F_{PL}|$ is known. As long as $|F_L| \ll |F_P|$ and $|F_L| \ll |F_{PL}|$, the possible values of
$\mathbf{F}_{PL}$ must lie within a small range of phase angles. Thus, as a first approximation, the
parent phase $\phi_P$ is a good estimate of $\phi_{PL}$.
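How small the range of phase angles is can be checked numerically. The sketch below uses hypothetical amplitudes (a parent structure factor of amplitude 10 and a ligand structure factor of amplitude 1) and sweeps the unknown ligand phase through a full circle. From the geometry of the vector sum, the phase of the complex can differ from the parent phase by at most arcsin(|F_L|/|F_P|), about 0.1 radian in this case.

```python
import numpy as np

# Hypothetical values: parent amplitude 10 with phase 0.8 rad, ligand amplitude 1.
F_P = 10.0 * np.exp(1j * 0.8)
ligand_amplitude = 1.0

# Sweep every possible ligand phase, since it is unknown in practice.
ligand_phases = np.linspace(0.0, 2.0 * np.pi, 3600, endpoint=False)
F_PL = F_P + ligand_amplitude * np.exp(1j * ligand_phases)

# Phase of the complex relative to the parent phase, in (-pi, pi].
phase_shift = np.angle(F_PL / F_P)
max_shift = np.max(np.abs(phase_shift))

# Geometric bound: the largest angle between F_P and F_P + F_L.
bound = np.arcsin(ligand_amplitude / abs(F_P))

print("largest phase shift found (rad):", max_shift)
print("arcsin(|F_L|/|F_P|) (rad):     ", bound)
```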


[Vector diagram in the complex plane; axes: Real (horizontal), Imaginary (vertical).]

Figure 13-40

Effect on observed structure factors of an added ligand. The parent-crystal
structure factor $\mathbf{F}_P$ is presumed to be known. Then, so long as the ligand
structure factor $|F_L|$ is small, the phase of $\mathbf{F}_P$ is a good approximation of
the phase of the ligand complex, $\mathbf{F}_{PL}$.

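A difference Fourier map thus can be calculated by analogy to Equation 13-107:

$$\Delta\rho(x, y, z) = \frac{1}{V} \sum_{h=-\infty}^{\infty} \sum_{k=-\infty}^{\infty} \sum_{l=-\infty}^{\infty} \bigl[\, |F_{PL}(h,k,l)| - |F_P(h,k,l)| \,\bigr]\, e^{i\phi_P(h,k,l)}\, e^{-2\pi i (hx + ky + lz)} \qquad (13\text{-}114)$$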

This map will show peaks that correspond to the position of the bound ligand. It
also will show adjacent positive and negative regions of electron density that
correspond to the movement of atoms induced by ligand binding. Figure 13-41 shows
this schematically for a one-dimensional difference Fourier.

An explicit justification for the validity of the difference Fourier map in repre- 
senting the structure of the bound ligand can be seen by examining a centrosymmetric 
projection. The true structure of the bound ligand is 

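$$\rho_L(x, y, z) = \frac{1}{V} \sum_{h=-\infty}^{\infty} \sum_{k=-\infty}^{\infty} \sum_{l=-\infty}^{\infty} |F_L(h,k,l)|\, e^{i\phi_L(h,k,l)}\, e^{-2\pi i (hx + ky + lz)} \qquad (13\text{-}115)$$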

We need to know how well $|F_L| e^{i\phi_L}$ is approximated in Equation 13-114 by
$[\,|F_{PL}| - |F_P|\,] e^{i\phi_P}$. In the centrosymmetric case, as long as $|F_L|$ is small, Equation 13-114
is exact, as you can see by applying the same arguments used in Figure 13-35. For
the general case, it is known that Equation 13-114 will correctly represent the electron
density of the ligand, except that the peaks will be only half the correct height, and
that there will be some noise in the data.
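The half-height behavior in the general (noncentrosymmetric) case can be seen in a small one-dimensional simulation. This is only a sketch under stated assumptions: a unit-length cell with fractional coordinates, point atoms, 40 parent atoms at pseudo-random positions with a hypothetical scattering factor of 8, one ligand atom with scattering factor 1 at x = 0.37, and reflections limited to |h| ≤ 30. The difference map is built from the amplitude differences and the parent phases, as in Equation 13-114, and compared with the true ligand density synthesized from the same reflections.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 1-D structure (fractional coordinates, unit cell length 1).
parent_x = rng.random(40)          # 40 parent atoms at pseudo-random positions
parent_f = 8.0                     # scattering factor of each parent atom
ligand_x, ligand_f = 0.37, 1.0     # one weakly scattering ligand atom
h = np.arange(-30, 31)             # reflections included in the synthesis

def structure_factors(h, x, f):
    """F(h) = sum_j f * exp(2*pi*i*h*x_j) for point atoms."""
    return (f * np.exp(2j * np.pi * np.outer(h, x))).sum(axis=1)

F_P = structure_factors(h, parent_x, parent_f)
F_L = structure_factors(h, np.array([ligand_x]), ligand_f)
F_PL = F_P + F_L                   # only |F_PL| would be measurable in practice

def synthesize(coefficients, grid):
    """rho(x) = sum_h c(h) * exp(-2*pi*i*h*x), with unit cell length."""
    return (coefficients[:, None] * np.exp(-2j * np.pi * np.outer(h, grid))).sum(axis=0).real

grid = np.arange(200) / 200.0

# Difference Fourier: amplitude differences combined with parent phases.
diff_coeff = (np.abs(F_PL) - np.abs(F_P)) * np.exp(1j * np.angle(F_P))
diff_map = synthesize(diff_coeff, grid)

# True ligand density from the same reflections, for comparison.
true_map = synthesize(F_L, grid)

peak_x = grid[np.argmax(diff_map)]
height_ratio = diff_map.max() / true_map.max()
print("difference-map peak position:", peak_x)
print("peak height relative to true ligand density:", height_ratio)
```

The ratio fluctuates around one half because the first-order error term carries an essentially random phase, which is the source of the noise mentioned above.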

Much of our knowledge about the structure of enzyme active sites comes from 
difference Fourier calculations on crystals containing bound substrates or bound 
Figure 13-41

Using difference Fourier syntheses to study ligand binding. Such syntheses can be used to identify ligand
binding sites and any conformational changes that accompany addition of the ligand. Shown are
one-dimensional schematic drawings of the parent electron density map ($\rho_P$), the map that would be
computed by solving the structure of the ligand complex ($\rho_{PL}$), and a difference Fourier ($\rho_{PL} - \rho_P$) that
could be calculated in a relatively simple fashion (see text). Note that ligand atoms simply lead to
increased density, whereas atom movements yield adjacent peaks and troughs in the difference Fourier.
[After T. L. Blundell and L. N. Johnson, Protein Crystallography (London: Academic Press, 1976).]



inhibitors. The availability of this technique means that, in many cases, the deter- 
mination of a macromolecular crystal structure is not so much the end of a massive 
effort as it is a starting point in the study of macromolecular function. 



Summary 

The x-ray scattering from an atom depends on its position in space and on the number 
of electrons it contains. The x-ray scattering from an array of atoms can be computed 
by summing the contributions of individual atoms. Periodic arrays of identical atoms 
restrict the observation of significant scattered intensity to only a discrete set of 
experimental geometries. Arrays of molecules can be treated in the same way as 
arrays of atoms. A crystal is a three-dimensional periodic array that consists of a
unit cell replicated in space. The vertices of the cell define a crystal lattice. The period- 
icity of the array restricts observable scattering to a very limited set of geometries, 
which form the reciprocal lattice of the crystal. 

The x-ray scattering can be described as the Fourier transform of the electron 
density of the object that generated it. Thus, if one could measure both the phase 
and the intensity of the scattered radiation, one could directly perform an inverse 
Fourier transform and reproduce the structure. Unfortunately, all one can measure is
the intensity. In principle, the pattern of scattered intensities still contains sufficient 
information to reconstruct the array that generated it. However, this information 
is not as easy to use or interpret. The inverse Fourier transform of the scattered 
intensity is called the Patterson function. It is a map of all interatomic vectors. Thus, 
if the structure contains n atoms per unit cell, there will be n² vectors per unit cell of
the Patterson function. 
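The relationship between intensities and interatomic vectors can be sketched with a discrete one-dimensional example. The grid size and atom positions below are hypothetical, and the discrete Fourier transform stands in for the continuous one. Three equal point atoms give n² = 9 vectors: three of zero length piled up at the origin, and six others appearing as symmetric pairs at plus and minus each interatomic separation.

```python
import numpy as np

# Hypothetical 1-D unit cell sampled on 100 grid points, with three equal point atoms.
n_grid = 100
rho = np.zeros(n_grid)
rho[[10, 30, 65]] = 1.0

# Only the intensities |F|^2 are used; all phase information is discarded.
F = np.fft.fft(rho)
intensity = np.abs(F) ** 2

# The inverse transform of the intensities is the Patterson function
# (here, the circular autocorrelation of the density).
patterson = np.fft.ifft(intensity).real

peaks = np.flatnonzero(np.rint(patterson) > 0)
for u in peaks:
    print("vector", u, "weight", int(np.rint(patterson[u])))
print("total weight:", int(np.rint(patterson.sum())))
```

Separations are reported modulo the cell length, so a vector of -20 grid units appears at 80.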

The structure of macromolecular crystals usually is solved by the technique of 
multiple isomorphous replacement. Heavy-metal derivatives of a parent crystal are 
prepared, and the scattered intensities of the parent and the derivatives are compared. 
A difference Patterson map calculated directly from the intensities mostly shows 
just heavy atom-heavy atom vectors. This allows a preliminary estimate of the heavy- 
atom positions. Using these positions and the differences between scattered intensities 
of the parent crystal and the derivatives, it is possible to estimate the phase of the 
scattered radiation. Once this estimate is available, the estimates of the heavy-atom 
positions can be made more precise. The procedure is repeated, or other refinement 
techniques are used. Finally, phases are available accurate enough to use in con- 
junction with the measured intensities to compute an image of the structure. 
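The phase-estimation step can be sketched for a single reflection. Since F_PH = F_P + F_H, the law of cosines gives |F_PH|² = |F_P|² + |F_H|² + 2|F_P||F_H| cos(φ_P - φ_H), so one derivative leaves two candidate values of φ_P, and a second derivative selects between them. All numerical values below are hypothetical, and the function names are introduced only for this illustration; real data would also carry measurement error, which this sketch ignores.

```python
import numpy as np

def candidate_phases(amp_P, amp_PH, F_H):
    """Two parent phases consistent with |F_P|, |F_PH|, and a known heavy-atom F_H."""
    cos_term = (amp_PH**2 - amp_P**2 - abs(F_H)**2) / (2.0 * amp_P * abs(F_H))
    delta = np.arccos(np.clip(cos_term, -1.0, 1.0))
    phi_H = np.angle(F_H)
    return np.array([phi_H + delta, phi_H - delta])

def wrap(angle):
    """Map an angle (or array of angles) into the interval [-pi, pi)."""
    return (angle + np.pi) % (2.0 * np.pi) - np.pi

# Hypothetical "true" parent structure factor; only its amplitude is observable.
F_P_true = 10.0 * np.exp(0.7j)
# Heavy-atom structure factors for two derivatives, assumed known from located heavy atoms.
F_H1 = 3.0 * np.exp(2.0j)
F_H2 = 4.0 * np.exp(-1.0j)

c1 = candidate_phases(abs(F_P_true), abs(F_P_true + F_H1), F_H1)
c2 = candidate_phases(abs(F_P_true), abs(F_P_true + F_H2), F_H2)

# The correct phase is the one candidate shared by both derivatives.
mismatch = np.abs(wrap(c1[:, None] - c2[None, :]))
i, j = np.unravel_index(np.argmin(mismatch), mismatch.shape)
phi_P_est = wrap(c1[i])

print("candidates from derivative 1:", wrap(c1))
print("candidates from derivative 2:", wrap(c2))
print("estimated parent phase:", phi_P_est)
```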



Problems 

13-1. Calculate the x-ray structure factor of the array of identical atoms shown in Figure 13-42, 
and compare it with the calculations in the text for similar arrays. Extend the result to 
the infinite array. Atoms are shown as circles, lattice points as dots. 

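[Diagram: a finite linear array of atoms (circles) alternating with lattice points (dots), with the origin marked at one end and the repeat distance labeled a.]

Figure 13-42

Array of atoms for Problem 13-1.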

13-2. Calculate the x-ray scattering intensity expected from the infinite two-dimensional
crystal shown schematically in Figure 13-43, where |a| = R, and |b| = 4R. You may
assume that the atomic scattering factor of the central atom of each molecule is twice
that of the other two atoms. Ignore the dependence of f on S. You should be able to
demonstrate that, if the intensity is plotted on the reciprocal lattice, it is constant for all
values of h but varies periodically with k, such that the strongest intensity is seen for
k = 0, 4, 8, ..., and the weakest for k = 2, 6, 10, .... Hint: Calculate the reciprocal lattice;
calculate the scattering expected for one molecule placed at the origin; then use the
principles of convolutions to compute the structure factor of the crystal; finally, square
the amplitude to calculate the intensity. [This problem was adapted from one suggested
by Bruno Zimm.]

13-3. Starting from the vector diagram in Figure 13-34, derive the following expression for 
the difference between the amplitude of the parent crystal (P) and that of a heavy-atom 
isomorphous derivative (PH): 

$|F_{PH}| - |F_P| = |F_H| \cos(\phi_{PH} - \phi_H) - 2|F_P| \sin^2[(\phi_P - \phi_{PH})/2]$

Under what conditions can a comparison of the differences in observed intensities be
used to obtain a good estimate of the heavy-atom structure-factor amplitude $|F_H|$?

13-4. Draw the molecule that would produce the Patterson map shown in Figure 13-44. (As- 
sume that all atoms are equal.) If you can't see how to do this, first choose a few arbitrary 
Figure 13-43

Crystal array for Problem 13-2.

Figure 13-44

Patterson map for Problem 13-4.



small molecules and construct their Patterson maps. Multiple-weighted peaks are
indicated in the figure; others have a weight of one.

13-5. A one-dimensional crystal has a unit cell 4 Å long. Measured x-ray scattering intensities,
ignoring the dependence of f(S) on S, are shown in Table 13-4. An isomorphous "heavy"-atom
derivative can be prepared, in which one atom with an atomic scattering factor of
5 is added per unit cell; its scattering intensities also are shown in the table.



Table 13-4

Measured x-ray scattering intensities for Problem 13-5

h     $|F_P(h)|^2$     $|F_{PH}(h)|^2$
0     49               144
1     5                20
2     25               0
3     5                20
4     49               144
5     5                20
6     25               0
7     5                20


a. Try to find the location of the heavy atom by using $|F_H| \approx \bigl|\,|F_{PH}| - |F_P|\,\bigr|$ in a difference
Fourier synthesis. Convince yourself that only the diffraction spots at h = 0, 1, 2, 3
need be considered. Assume for the moment that all phase terms are +1, and perform
the calculation only for x = 0, R/4, R/2, and 3R/4, where R is the length of the unit cell.

b. Because the result in part a gives two intensity maxima as possible positions for the
heavy atom, calculate the contribution each makes to the structure factor. Use the 
phases calculated from each position, along with the intensities estimated as in part 
a, to repeat the difference Fourier. Is there any improvement? 

c. Because the information that $|F_H| = 5$ has been given, use the isomorphous
replacement technique outlined in the text to estimate the phases of $|F_P(h)|$ for the
two possible positions of the heavy atom. Now, using the criterion that $\rho(x)$ must be
real for all x, select among these phases for four acceptable choices, and perform a
Fourier synthesis of the $|F_P(h)|$ data to yield the structure. Note that each synthesis
produces the same structure, except for changes in the origin of the unit cell and the 
direction of positive x. 